Abstract — We study the problem of interpolating all values
of a discrete signal $f$ of length $N$ when $d < N$ values are
known, especially in the case when the Fourier transform of
the signal is zero outside some prescribed index set $\mathcal{J}$; these
comprise the (generalized) bandlimited spaces $\mathbb{B}^{\mathcal{J}}$. The sampling
pattern for $f$ is specified by an index set $\mathcal{I}$, and is said to be a
universal sampling set if samples in the locations $\mathcal{I}$ can be used
to interpolate signals from $\mathbb{B}^{\mathcal{J}}$ for any $\mathcal{J}$. When $N$ is a prime
power we give several characterizations of universal sampling
sets, some structure theorems for such sets, an algorithm for
their construction, and a formula that counts them. There are
also natural applications to additive uncertainty principles.

Index Terms — Compressed sensing, Discrete Fourier trans- 
forms, Discrete time systems, Interpolation, Sampling methods, 
Uncertainty 

I. Introduction 

IN this paper and in a sequel [1] we consider the problem
of interpolating all values of a discrete, periodic signal
$f\colon \mathbb{Z}_N \to \mathbb{C}$, $N \ge 2$, when $d < N$ values of $f$ are known.
One solution is a discrete form of the classical Nyquist-
Shannon theorem, where the spectrum of the signal is assumed
to vanish outside a contiguous band of frequencies; see [2], for
example. At the other extreme is the new and important area 
of compressed sensing, where no assumptions on the spectrum 
are made. For this, of the many papers we mention only [3],
[4] and [5], since we will refer to this work later.

Our approach to the problem is in between, though we begin 
by formulating a very general definition. 

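Definition 1: Let $Y$ be a $d$-dimensional subspace of $\mathbb{C}^N$,
let $\mathcal{I} \subseteq [0 : N-1]$ be an index set of size $d$, and let
$\mathcal{U}_{\mathcal{I}} = \{u_i : i \in \mathcal{I}\}$ be a set of $d$ vectors in $Y$. We say that
$(\mathcal{I}, \mathcal{U}_{\mathcal{I}})$ is an interpolating system if each $f \in Y$ can be written as

\[ f = \sum_{i \in \mathcal{I}} f(i)\, u_i. \tag{1} \]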

We call $\mathcal{I}$ a sampling set and $\mathcal{U}_{\mathcal{I}}$ an interpolating basis. When
we refer simply to a sampling set we always mean that it is
associated with an interpolating basis. If the vectors $u_i$ are
orthogonal we say that $(\mathcal{I}, \mathcal{U}_{\mathcal{I}})$ is an orthogonal interpolating
system and that $\mathcal{U}_{\mathcal{I}}$ is an orthogonal interpolating basis.

Manuscript received August 31, 2011, revised January 20, 2012. A.
Siripuram was supported by a Stanford Graduate Fellowship. W. Wu was
supported by the Frank and Eva Buck Foundation.

W. Wu is currently at the Jet Propulsion Laboratory, Pasadena, CA.

The authors are listed alphabetically.

Copyright (c) 2011 IEEE. Personal use of this material is permitted.
However, permission to use this material for any other purposes must be
obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

The point of the definition is that the interpolation of all
values of $f$ uses the sampled values $f(i)$, $i \in \mathcal{I}$, which might
be thought of as measurements of $f$ with respect to the fixed,
natural basis of the ambient space $\mathbb{C}^N$, while the basis $\mathcal{U}_{\mathcal{I}}$
is tailored to $Y$ and $\mathcal{I}$.¹ Note that $\mathcal{I}$ need not consist of
uniformly spaced indices, so the sampling may be irregular.
Indeed, the results described here and in [1] were originally
motivated by questions from colleagues in medical imaging 
who had observed that irregular sampling patterns could often 
give excellent results with less computation. 

For us, to solve the interpolation problem for Y is to find 
an interpolating system. It is a linear theory in all aspects. 
Every subspace has an interpolating system, though it may 
not be unique, but not every subspace has an orthogonal 
interpolating system. For a given subspace it is also not true 
that any index set is a sampling set for some interpolating 
basis, so the intervals between samples are not arbitrary. The 
only subspaces that have orthonormal interpolating systems 
are the coordinate subspaces. All of this is discussed in Section
II. Orthogonal interpolating systems are the subject of [1], and
we find interesting connections with difference sets, perfect 
graphs, tiling, and we answer affirmatively a discrete version 
of a conjecture of Fuglede. 

In Section II we provide some basic results on interpolating
systems in general. We quickly move, in Section III, to study
bandlimited spaces, $\mathbb{B}^{\mathcal{J}}$, defined as signals whose discrete
Fourier transforms are supported on $\mathcal{J}$. We do not require that
$\mathcal{J}$ be a set of contiguous indices, so this is more general than
the situation in the discrete Nyquist-Shannon theorem (though
we continue to use the term "bandlimited" for short).

In Section IV we begin to concentrate on universal sampling
sets, namely index sets $\mathcal{I}$ that are sampling sets for
any bandlimited space $\mathbb{B}^{\mathcal{J}}$ with $|\mathcal{J}| = |\mathcal{I}|$. That is, $\mathcal{I}$ is
universal if the sampling pattern specified by $\mathcal{I}$ can be used
for interpolation of signals from any $\mathbb{B}^{\mathcal{J}}$. Universal sampling
sets were used in [4] for multicoset sampling and in [5] in
connection with compressed sensing. Here our central result
gives several necessary and sufficient conditions for an index
set to be universal when $N$ is a prime power. A mathematical
consequence of our result is a generalization of Chebotarev's
theorem on the invertibility of submatrices of the Fourier
matrix.

¹ We could make the definition even more general and allow $Y$ to be a
subspace of any finite-dimensional vector space $X$, and sample $f \in Y$ with
respect to any fixed basis of $X$, but the present definition suffices.

In Section [V] we show that a universal sampling set has 
an interesting structure as a disjoint union of what we call 
elementary universal sets, and through this analysis we are able 
to count the number of universal sampling sets of a given size. 
We also introduce maximal (and minimal) universal sampling 
sets which in turn enter naturally into the uncertainty principles 
that we discuss in Section VI. As an application of uncertainty
and universality we prove a "random" uncertainty principle,
and deduce a generalization of the Cauchy-Davenport theorem
from additive number theory. Our debt to the work in [6]
and [3] is clear. Many of our results assume that $N$ is a
prime power, and naturally we wonder whether this can be 
generalized. 

The definitions we introduce and the methods we use are 
based primarily on properties of index sets when the elements 
are reduced modulo powers of a prime. With a few exceptions 
(e.g., minimal and cyclotomic polynomials) these can be 
considered elementary, and it is surprising (to us) how far 
they lead. The methods here also seem rather different from 
those of compressed sensing. In compressed sensing, which 
is nonlinear in theory and practice, the recovery of a signal 
from samples does not require knowledge of the frequency 
spectrum, whereas linear theories like ours cannot do without 
knowledge of the spectrum. Nevertheless, with universality 
the sampling patterns in our approach do not depend on the 
frequencies, the reconstruction of a signal from its samples is 
by linear operations, and the samples are "samples" in the 
classical sense instead of random projections of the signal 
onto a measurement basis as is done in compressed sensing. 
Both approaches start with discrete signals, but one needs to 
sample an analog signal in the first place and this analog 
sampling generally needs some knowledge of the frequency 
spectrum. Works such as [4] and [5] confront this issue through
"spectrum blind" sampling, and they end up needing the idea 
of universality in the process. It is also interesting that the 
linear theory here can be used to prove a random uncertainty 
principle without the necessity of nonlinear techniques, though 
our result is not as strong as the result in [3]. We hope to
pursue the connections and differences further. We refer to [2]
and [7] for additional results, discussion, and examples. See
also Appendix [C] for references to papers on universality for 
continuous-time signals. 

II. General Properties, Existence of 
Interpolating Systems 

This section is a summary of elementary properties of 
interpolating systems, including existence theorems in both 
an algebraic and geometric formulation. The ideas are simple 
enough, but they fit together nicely and are an essential 
foundation for the less simple work to follow. 

We fix some notation. Without further comment we will
identify a vector in $\mathbb{C}^N$ with its $N$-periodic extension and
vice versa, and we typically index vectors from $0$ to $N-1$.
(We assume periodicity because the discrete Fourier transform
will soon enter the picture.) For $i \in [0 : N-1]$ we let
$\delta_i \colon \mathbb{Z}_N \to \mathbb{C}$ be the (periodized) discrete $\delta$-function shifted
to $i$, so that $\{\delta_0, \delta_1, \dots, \delta_{N-1}\}$ is the natural basis of $\mathbb{C}^N$.
The components of a vector in $\mathbb{C}^N$ will always be in terms of
the natural basis, but any fixed basis of $\mathbb{C}^N$ would do for the
following development. If $\mathcal{I} \subseteq [0 : N-1]$ we let

\[ \mathbb{C}^{\mathcal{I}} = \operatorname{span}\{\delta_i : i \in \mathcal{I}\}. \]

Our first goal is to establish 

Theorem 1: Any subspace Y of C N has an interpolating 
system. 

We will give two proofs, one geometric and one algebraic, and 
both are straightforward. 

In the following, $Y$ is always a subspace of dimension $d$ and
$\mathcal{I}$ is always an index set of size $d$. Let $\mathcal{I}' = [0 : N-1] \setminus \mathcal{I}$.
We record several facts. 

An interpolating basis for a subspace Y is trying to be the 
natural basis in the slots specified by the index set. In fact this 
is a characterization of interpolating bases. 

Proposition 1: (i) A basis $\mathcal{U} = \{u_i : i \in \mathcal{I}\}$ for $Y$ is an
interpolating basis if and only if

\[ u_i(j) = \delta_i(j) \quad \text{for all } i, j \in \mathcal{I}. \]

(ii) Any natural basis vector $\delta_k$ lying in $Y$ is an element of
any interpolating basis of $Y$.

(iii) An interpolating basis is determined by its index
set; more precisely, if $\{u_i : i \in \mathcal{I}\}$ and $\{v_i : i \in \mathcal{I}\}$ are
interpolating bases for $Y$ then $u_i = v_i$ for all $i \in \mathcal{I}$.

Expanding on the first point in Proposition 1, the elements
of an interpolating basis are perturbations of the natural basis
vectors by vectors outside $Y$:

Proposition 2: (i) Any interpolating basis $\{u_i : i \in \mathcal{I}\}$ of
$Y$ is of the form

\[ u_i = \delta_i + v_i, \]

where $v_i \in \mathbb{C}^{\mathcal{I}'}$. If $v_i \in Y$ then $v_i = 0$.

(ii) The subspaces of $\mathbb{C}^N$ having an orthogonal interpolating
system are of the form $Y = \operatorname{span}\{\delta_i + v_i : i \in \mathcal{I}\}$ where the
nonzero $v_i$ are orthogonal vectors in $\mathbb{C}^{\mathcal{I}'}$.

We omit the proofs of Propositions 1 and 2. Part (ii) of
Proposition 2 can be applied in the negative to find examples
of subspaces that do not have an orthogonal interpolating basis 
- this is a much larger topic - and it also follows from part (ii) 
that the only subspaces having an orthonormal interpolating 
basis are the coordinate subspaces. Both of these points were 
raised in the introduction. 

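Geometric Proof of Theorem 1: It is easy to see that there
is an index set $\mathcal{J}$ of size $N - d$ such that $\mathbb{C}^N = Y \oplus \mathbb{C}^{\mathcal{J}}$. Let
$P \colon Y \oplus \mathbb{C}^{\mathcal{J}} \to Y$ be the projection of $\mathbb{C}^N$ onto $Y$ along $\mathbb{C}^{\mathcal{J}}$.
If $f \in Y$ then, on the one hand,

\[ f = \sum_{i=0}^{N-1} f(i)\, \delta_i. \]

On the other hand, since $\mathbb{C}^{\mathcal{J}} = \ker P$ and $Pf = f$ we have

\[ f = Pf = \sum_{i=0}^{N-1} f(i)\, P\delta_i = \sum_{i \notin \mathcal{J}} f(i)\, P\delta_i. \]

Thus the $u_i = P\delta_i$ form an interpolating basis of $Y$ indexed
by $\mathcal{I} = [0 : N-1] \setminus \mathcal{J}$. ■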

We see from this why an interpolating basis need not be 
unique. The ambiguity in choosing an interpolating basis arises 
from the ambiguity in choosing a complement; if there is not 
a unique choice of the complement $\mathbb{C}^{\mathcal{J}}$ of $Y$, and generally
there is not, then there is not a unique interpolating basis for
$Y$. However, the existence of an interpolating basis produces
a complement to $Y$:

Proposition 3: Let $\mathcal{U} = \{u_i : i \in \mathcal{I}\}$ be an interpolating
basis of $Y$. Then $\mathbb{C}^N = Y \oplus \mathbb{C}^{\mathcal{I}'}$.

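Proof: If we show that $Y \cap \mathbb{C}^{\mathcal{I}'} = \{0\}$ then $\mathcal{U} \cup \{\delta_j : j \in \mathcal{I}'\}$
forms a basis for $\mathbb{C}^N$. For this, let $f \in Y \cap \mathbb{C}^{\mathcal{I}'}$. Then

\[ f = \sum_{i \in \mathcal{I}} f(i)\, u_i \tag{2} \]

because $\mathcal{U}$ is an interpolating basis for $Y$, and also

\[ f = \sum_{j \in \mathcal{I}'} f(j)\, \delta_j. \]

Thus

\[ \sum_{i \in \mathcal{I}} f(i)\, u_i = \sum_{j \in \mathcal{I}'} f(j)\, \delta_j. \]

Let $k \in \mathcal{I}$ and evaluate both sides at $k$. On the left,
$u_i(k) = \delta_i(k)$ by Proposition 1, and on the right $\delta_j(k) = 0$
for every $j \in \mathcal{I}'$, so

\[ f(k) = \sum_{i \in \mathcal{I}} f(i)\, u_i(k) = \sum_{j \in \mathcal{I}'} f(j)\, \delta_j(k) = 0. \]

By (2), $f = 0$ and we are done. ■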
The algebraic proof of Theorem 1 is in terms of matrices.
Associate with an index set $\mathcal{I} = \{i_1, i_2, \dots, i_d\}$ the $N \times d$
matrix $E_{\mathcal{I}}$ whose $d$ columns are the basis vectors $\delta_{i_1}, \delta_{i_2}, \dots,
\delta_{i_d}$. If $R$ is an $N \times M$ matrix then $E_{\mathcal{I}}^T R$ is the $d \times M$ submatrix of
$R$ obtained by choosing the rows indexed by $\mathcal{I}$. In particular,
operating by $E_{\mathcal{I}}^T$ on an $N$-vector $f$ produces the $d$-vector with
components $f(i_1), f(i_2), \dots, f(i_d)$. If $R$ is an $M \times N$ matrix
then $R E_{\mathcal{I}}$ is the $M \times d$ submatrix of $R$ obtained by choosing
the columns indexed by $\mathcal{I}$.

We note three general facts. First, $E_{\mathcal{I}}^T E_{\mathcal{I}} = I_d$, where $I_d$ is
the $d \times d$ identity matrix. Second, if $S$ is a $d \times d$ matrix then
$E_{\mathcal{I}}^T (RS) = (E_{\mathcal{I}}^T R) S$. Finally, if $\mathcal{U} = \{u_{i_1}, u_{i_2}, \dots, u_{i_d}\}$ is a
basis for $Y$ and $U$ is the $N \times d$ matrix whose columns are the
$u_i$ then the condition (1) that $\mathcal{U}$ be an interpolating basis can
be written in matrix form as

\[ f = U E_{\mathcal{I}}^T f \tag{3} \]

for all $f \in Y$. Here $U E_{\mathcal{I}}^T$ is an $N \times N$ matrix and we see that
$\mathcal{U}$ is an interpolating basis for $Y$ with sampling set $\mathcal{I}$ if and
only if $Y = \ker(I_N - U E_{\mathcal{I}}^T)$.
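Now we have

Algebraic Proof of Theorem 1: Take any basis $\mathcal{V} =
\{v_1, v_2, \dots, v_d\}$ of $Y$ and let $R$ be the $N \times d$ matrix whose
columns are the basis vectors $v_k$; thus $R_{jk} = v_k(j)$. Since $R$
has rank $d$ it has a $d \times d$ invertible submatrix, and possibly
many such submatrices. Let $\mathcal{I}$ be the index set corresponding
to the $d$ rows chosen from $R$ to form the invertible submatrix
$E_{\mathcal{I}}^T R$. The columns of the $N \times d$ matrix $R (E_{\mathcal{I}}^T R)^{-1}$ are again
a basis of $Y$. We write them as $\{u_i : i \in \mathcal{I}\}$, indexed by
$\mathcal{I}$. Since

\[ E_{\mathcal{I}}^T \bigl( R (E_{\mathcal{I}}^T R)^{-1} \bigr) = (E_{\mathcal{I}}^T R)(E_{\mathcal{I}}^T R)^{-1} = I_d, \]

the $u_i$ are as in Proposition 1 and hence comprise an
interpolating basis of $Y$. ■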

This proof shows how to produce an interpolating basis provided
one can find a $d \times d$ invertible submatrix $E_{\mathcal{I}}^T R$, indexed
by $\mathcal{I}$. The more such submatrices, the more interpolating bases
for $Y$. On the opposite side, in general not every index set $\mathcal{I}$
is a sampling set for an interpolating basis since, in general, not
every choice of a $d \times d$ submatrix is invertible.

A slightly different way of arranging the algebraic proof
also gives an interpolation formula, making (3) more explicit.
As above, let $\mathcal{V} = \{v_1, v_2, \dots, v_d\}$ be a basis of $Y$ and let $R$
be the corresponding $N \times d$ matrix. If $f \in Y$ then

\[ f = \sum_{k=0}^{N-1} f(k)\, \delta_k \quad \text{and also} \quad f = \sum_{k=1}^{d} \alpha_k v_k, \]

for some constants $\alpha_k$. We want to solve for the $\alpha_k$ in terms
of $d$ of the values $f(k)$. Write the second equation for $f$ as

\[ f = R\alpha, \qquad \alpha = (\alpha_1, \alpha_2, \dots, \alpha_d)^T. \]

Now $R$ has an invertible $d \times d$ submatrix, say $E_{\mathcal{I}}^T R$ for an
index set $\mathcal{I}$, and so

\[ E_{\mathcal{I}}^T f = E_{\mathcal{I}}^T (R\alpha) = (E_{\mathcal{I}}^T R)\alpha. \]

We can then solve for $\alpha$ via

\[ \alpha = (E_{\mathcal{I}}^T R)^{-1} (E_{\mathcal{I}}^T f), \]

resulting in

\[ f = R (E_{\mathcal{I}}^T R)^{-1} (E_{\mathcal{I}}^T f). \tag{4} \]

This equation writes $f$ in terms of the components $f(i)$, $i \in \mathcal{I}$.
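As an illustration only, the interpolation formula (4) can be sketched numerically. The function name `interpolate`, the use of NumPy, and the small test matrix are assumptions made for this sketch and are not part of the paper.

```python
import numpy as np

def interpolate(R, I, samples):
    """Sketch of formula (4): f = R (E_I^T R)^{-1} (E_I^T f).

    R       : N x d matrix whose columns are a basis of the subspace Y
    I       : list of d row indices (the candidate sampling set)
    samples : the d values f(i), i in I, in the same order as I
    """
    R = np.asarray(R, dtype=complex)
    sub = R[list(I), :]                      # E_I^T R: rows of R indexed by I
    if np.linalg.matrix_rank(sub) < R.shape[1]:
        raise ValueError("E_I^T R is singular: I is not a sampling set for Y")
    alpha = np.linalg.solve(sub, np.asarray(samples, dtype=complex))
    return R @ alpha                         # all N values of f

# Hypothetical example: a 2-dimensional subspace of C^4.
R = np.array([[1, 0], [0, 1], [1, 1], [2, -1]], dtype=float)
f = R @ np.array([3.0, 5.0])                 # some f in Y
I = [2, 3]                                   # sample only f(2) and f(3)
f_rec = interpolate(R, I, f[I])
```

Here `f_rec` agrees with `f` up to rounding, since the rows of `R` indexed by `I` form an invertible 2 x 2 matrix.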

Carrying the algebraic line of reasoning a little further, we 
also see how two interpolating bases for Y are related to each 
other. 

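Theorem 2: Fix an interpolating basis of $Y$, indexed by $\mathcal{J}$,
and let $R$ be the corresponding $N \times d$ matrix. If $S$ is the matrix
of another interpolating basis of $Y$, indexed by $\mathcal{I}$, then $E_{\mathcal{I}}^T R$
is invertible and

\[ S = R (E_{\mathcal{I}}^T R)^{-1}. \]

Proof: Let $\{v_i : i \in \mathcal{I}\}$ be the interpolating basis of $Y$ whose
elements are the columns of $S$ and let $\{u_j : j \in \mathcal{J}\}$ be the columns of
$R$. Since the $u_j$ are an interpolating basis we can write, for
each $i \in \mathcal{I}$,

\[ v_i = \sum_{j \in \mathcal{J}} v_i(j)\, u_j. \]

In matrix form this is

\[ S = R (E_{\mathcal{J}}^T S). \]

Now multiply on the left by $E_{\mathcal{I}}^T$, resulting in

\[ E_{\mathcal{I}}^T S = E_{\mathcal{I}}^T \bigl( R (E_{\mathcal{J}}^T S) \bigr) = (E_{\mathcal{I}}^T R)(E_{\mathcal{J}}^T S). \]

But $E_{\mathcal{I}}^T S$ is the $d \times d$ identity matrix, so this shows that $E_{\mathcal{I}}^T R$
is invertible, that $(E_{\mathcal{I}}^T R)^{-1} = E_{\mathcal{J}}^T S$, and then that

\[ S = R (E_{\mathcal{I}}^T R)^{-1}. \]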
Finally, we look a little more closely at the interpolating
basis provided by $R (E_{\mathcal{I}}^T R)^{-1}$ in relation to the geometric
construction. From the $d \times d$ matrix $(E_{\mathcal{I}}^T R)^{-1}$ form a $d \times N$
matrix by adding $N - d$ columns of zeros in the slots $\mathcal{I}'$. Call
this matrix $T$. Then $RT$ is an $N \times N$ matrix and one sees that

\[ RT\delta_i = \begin{cases} u_i, & i \in \mathcal{I}, \\ 0, & i \in \mathcal{I}'. \end{cases} \]

Thus $RT$ is the projection of $\mathbb{C}^N$ onto $Y$ along $\mathbb{C}^{\mathcal{I}'}$ and we
are back to the idea of the geometric argument. Observe that
whereas the geometric argument started with a complement
$\mathbb{C}^{\mathcal{I}'}$ to $Y$ and produced the interpolating basis via projection,
here we started with an interpolating basis for $Y$ and produced
the projection and the complement.

III. Discrete Bandlimited Spaces 

Bandlimited signals are defined by the vanishing of the 
discrete Fourier transform outside a set of specified indices. 
They form a particularly interesting class of subspaces. 

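For notation, let

\[ \zeta_n = e^{-2\pi i/n}, \]

simplified to just $\zeta$ when $n = N$, and let $\omega \colon \mathbb{Z}_N \to \mathbb{C}$ be
the discrete complex exponential,

\[ \omega(m) = \zeta^{-m}. \]

The discrete Fourier transform is then

\[ \mathcal{F} f = \sum_{n=0}^{N-1} f(n)\, \omega^{-n}. \]

As usual, we also regard $\mathcal{F}$ as an $N \times N$ matrix whose
$mn$-entry is $\mathcal{F}_{mn} = \omega^{-n}(m) = \zeta^{mn}$. We recall that $\mathcal{F}^{-1} =
(1/N)\mathcal{F}^*$ (the adjoint of $\mathcal{F}$).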

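Definition 2: Let $\mathcal{J} \subseteq [0 : N-1]$. The $|\mathcal{J}|$-dimensional
space of bandlimited signals with frequency support $\mathcal{J}$ is

\[ \mathbb{B}^{\mathcal{J}} = \mathcal{F}^{-1}(\mathbb{C}^{\mathcal{J}}). \]

In words, $f \in \mathbb{B}^{\mathcal{J}}$ if $\mathcal{F}f$ has zeros in the slots $\mathcal{J}' = [0 :
N-1] \setminus \mathcal{J}$. There might be more zeros of $\mathcal{F}f$ for a given $f$ but
there are at least these zeros. We do not assume that the indices
in $\mathcal{J}$ are contiguous, so $\mathcal{F}f$ is not necessarily supported on
a "band" of frequencies, but we maintain the use of the term
"bandlimited" in all cases.

Since $\mathcal{F}^*\delta_n(m) = \zeta^{-mn}$, we get a basis for $\mathbb{B}^{\mathcal{J}}$ by
pulling out of $\mathcal{F}^*$ the columns indexed by $\mathcal{J}$. Thus we get
an interpolating basis with sampling set $\mathcal{I}$ if and only if
$E_{\mathcal{I}}^T \mathcal{F}^* E_{\mathcal{J}}$ is invertible, or equivalently if and only if $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$
is invertible. We prefer to use the latter, with $\mathcal{F}$ instead of $\mathcal{F}^*$.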
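The invertibility criterion just stated can be tried out numerically. The following is a minimal sketch, assuming NumPy and a numerical tolerance `tol` in place of an exact test; the helper names `dft_matrix`, `is_sampling_set` and `interpolate_bandlimited` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def dft_matrix(N):
    """N x N Fourier matrix with (m, n) entry zeta^(m n), zeta = exp(-2 pi i / N)."""
    idx = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / N)

def is_sampling_set(I, J, N, tol=1e-9):
    """Numerical test: is the submatrix of F with rows I and columns J invertible?"""
    I, J = sorted(I), sorted(J)
    if len(I) != len(J):
        return False
    sub = dft_matrix(N)[np.ix_(I, J)]
    # Smallest singular value above tol is taken to mean "invertible".
    return bool(np.linalg.svd(sub, compute_uv=False).min() > tol)

def interpolate_bandlimited(I, J, N, samples):
    """Recover f in B^J from f(i), i in I (samples listed in increasing order of i)."""
    I, J = sorted(I), sorted(J)
    B = dft_matrix(N).conj()[:, J]           # columns of F^* indexed by J span B^J
    coeffs = np.linalg.solve(B[I, :], np.asarray(samples, dtype=complex))
    return B @ coeffs
```

For instance, with $N = 8$ and $\mathcal{J} = \{0, 2\}$, `is_sampling_set([0, 1], [0, 2], 8)` holds while `is_sampling_set([0, 4], [0, 2], 8)` fails, because rows 0 and 4 of the two chosen columns coincide.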

For the remainder of this paper, interpolating systems for 
bandlimited spaces will be our main concern. Spaces of 
bandlimited functions having orthogonal interpolating bases 
are the subject of [1], but we do have one general observation
here: such spaces cannot be too big. 

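Proposition 4: If $\mathbb{B}^{\mathcal{J}}$ has an orthogonal interpolating basis
then $|\mathcal{J}| \le N/2$.

Proof: Suppose $\mathbb{B}^{\mathcal{J}}$ has an orthogonal interpolating basis
indexed by $\mathcal{I}$. Then $|\mathcal{I}| = |\mathcal{J}|$. Let $\mathcal{I}' = [0 : N-1] \setminus \mathcal{I}$. By
Proposition 2 we can write

\[ \mathbb{B}^{\mathcal{J}} = \operatorname{span}\{\delta_i + v_i : i \in \mathcal{I}\}, \]

where the $v_i$ are orthogonal vectors in $\mathbb{C}^{\mathcal{I}'}$, or some possibly
0. But none of the $v_i$ can be zero, for $\mathcal{F}\delta_k = \omega^{-k}$ which never
vanishes. There are $|\mathcal{I}|$ of the $v$'s, and if $|\mathcal{J}| = |\mathcal{I}| > N/2$ then
$|\mathcal{I}'| < N/2$ and we would have more than $N/2$ orthogonal
vectors in a space of dimension less than $N/2$. ■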

A. Necklaces and Bracelets 

Sampling sets for bandlimited spaces have more algebraic
structure than it might appear. Namely, the property of being
a sampling set for a particular $\mathbb{B}^{\mathcal{J}}$ is preserved under the
action of the dihedral group. To explain, on $\mathbb{Z}_N$ we denote
the operations of translation (by 1) and reflection by $\tau$ and $\rho$,
respectively:

\[ \tau \colon \mathbb{Z}_N \to \mathbb{Z}_N, \quad \tau(n) = n - 1 \bmod N, \]
\[ \rho \colon \mathbb{Z}_N \to \mathbb{Z}_N, \quad \rho(n) = -n \bmod N. \]

Then

\[ \tau^N = \mathrm{id}, \quad \rho^2 = \mathrm{id} \quad \text{and} \quad \rho\tau\rho = \tau^{-1} \ \text{ or } \ (\rho\tau)^2 = \mathrm{id}, \]

so $\tau$ and $\rho$ generate the dihedral group $\mathrm{Dih}_N$. Clearly $\mathrm{Dih}_N$
can act on an index set $\mathcal{I}$ via

\[ \tau\mathcal{I} = \{\tau(i) : i \in \mathcal{I}\}, \qquad \rho\mathcal{I} = \{\rho(i) : i \in \mathcal{I}\}. \]

We define the bracelet of $\mathcal{I}$ to be the orbit of $\mathcal{I}$ under the action
of $\mathrm{Dih}_N$. The necklace of $\mathcal{I}$ is the orbit of $\mathcal{I}$ under the action of
the cyclic subgroup $\langle\tau\rangle$ of $\mathrm{Dih}_N$. Think of $\mathcal{I} \subseteq [0 : N-1]$ as
specifying a pattern of $N$ beads on a loop, with black beads in
the locations in $\mathcal{I}$ separated by white beads in the locations in
the complement $\mathcal{I}'$, as in Figure 1. A necklace is worn around
the neck, and if the cyclic group acts then the spacing of the
black and white beads is the same however the necklace is
rotated. But a bracelet can be worn on either wrist, introducing
a reflection, and the symmetry group is $\mathrm{Dih}_N$. See Appendix B
for a formula that counts distinct bracelets, and for references.
With these definitions we now have

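Proposition 5: If $\mathcal{I}$ is a sampling set for $\mathbb{B}^{\mathcal{J}}$ then any index
set in the bracelet of $\mathcal{I}$ is a sampling set for $\mathbb{B}^{\mathcal{J}}$.

Proof: Let $\mathcal{I} = \{m_1, m_2, \dots, m_d\}$, $\mathcal{J} =
\{n_1, n_2, \dots, n_d\}$ and let $\mathcal{K} = \tau\mathcal{I}$. Then the new submatrix
$E_{\mathcal{K}}^T \mathcal{F} E_{\mathcal{J}}$ is given by

\[
\begin{pmatrix}
\zeta^{(m_1-1)n_1} & \zeta^{(m_1-1)n_2} & \cdots & \zeta^{(m_1-1)n_d} \\
\zeta^{(m_2-1)n_1} & \zeta^{(m_2-1)n_2} & \cdots & \zeta^{(m_2-1)n_d} \\
\vdots & \vdots & & \vdots \\
\zeta^{(m_d-1)n_1} & \zeta^{(m_d-1)n_2} & \cdots & \zeta^{(m_d-1)n_d}
\end{pmatrix}
=
\begin{pmatrix}
\zeta^{m_1 n_1} & \zeta^{m_1 n_2} & \cdots & \zeta^{m_1 n_d} \\
\vdots & \vdots & & \vdots \\
\zeta^{m_d n_1} & \zeta^{m_d n_2} & \cdots & \zeta^{m_d n_d}
\end{pmatrix}
\begin{pmatrix}
\zeta^{-n_1} & 0 & \cdots & 0 \\
0 & \zeta^{-n_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \zeta^{-n_d}
\end{pmatrix}
\]

\[ = (E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}) \times \text{an invertible diagonal matrix}. \]

Hence $E_{\mathcal{K}}^T \mathcal{F} E_{\mathcal{J}}$ is invertible whenever $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ is, and the
same is true for any translation of $\mathcal{I}$.

Next suppose $\mathcal{K}$ is obtained by reversing $\mathcal{I}$, namely $\mathcal{K} =
\{N - m_1, N - m_2, \dots, N - m_d\}$. Then $E_{\mathcal{K}}^T \mathcal{F} E_{\mathcal{J}}$ is just
the conjugate of $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$, so again, $E_{\mathcal{K}}^T \mathcal{F} E_{\mathcal{J}}$ is invertible
whenever $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ is. ■

Fig. 1: Two different bracelets with $N = 12$ and $|\mathcal{I}| = 4$. On
top the index set is $\mathcal{I} = \{0, 2, 5, 7\}$, on the bottom the index
set is $\mathcal{I} = \{0, 3, 5, 6\}$.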
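To make the necklace and bracelet of an index set concrete, here is a small sketch in plain Python that enumerates both orbits by applying the translation $\tau(n) = n - 1$ and the reflection $\rho(n) = -n$ modulo $N$. The function names `necklace` and `bracelet` are hypothetical, chosen for this illustration.

```python
def necklace(I, N):
    """Orbit of the index set I under the cyclic group generated by tau(n) = n - 1 mod N."""
    return {frozenset((i - s) % N for i in I) for s in range(N)}

def bracelet(I, N):
    """Orbit of I under the dihedral group generated by tau and rho(n) = -n mod N."""
    orbit = set()
    for shifted in necklace(I, N):
        orbit.add(shifted)                                  # tau^s I
        orbit.add(frozenset((-i) % N for i in shifted))     # rho tau^s I
    return orbit

# The two index sets from Fig. 1 (N = 12) lie in different bracelets.
b1 = bracelet({0, 2, 5, 7}, 12)
b2 = bracelet({0, 3, 5, 6}, 12)
print(len(b1), len(b2), b1.isdisjoint(b2))
```

By Proposition 5, testing one representative of a bracelet against a given $\mathbb{B}^{\mathcal{J}}$ settles the question for every index set in that orbit.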

IV. Universal Sampling Sets 

There is a kind of interchange duality for bandlimited spaces
between sampling sets and frequency support sets. On the one
hand, the sampling problem is to start with $\mathbb{B}^{\mathcal{J}}$ and ask which
index sets $\mathcal{I}$ are sampling sets. On the other hand, one could
also start with an index set $\mathcal{I}$ and ask which $\mathbb{B}^{\mathcal{J}}$ result from
this sampling pattern. These two questions are equivalent.

Proposition 6: $\mathbb{B}^{\mathcal{J}}$ has $\mathcal{I}$ as a sampling set if and only if
$\mathbb{B}^{\mathcal{I}}$ has $\mathcal{J}$ as a sampling set.

Proof: The subspace $\mathbb{B}^{\mathcal{J}}$ has $\mathcal{I}$ as a sampling set if and
only if $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ is invertible, and this is true if and only if its
transpose $E_{\mathcal{J}}^T \mathcal{F} E_{\mathcal{I}}$ is invertible. ■

Though the sampling problem may seem the more natural
one, we will concentrate on the second, equivalent question
and ask which frequency patterns, that is which $\mathbb{B}^{\mathcal{J}}$, can arise
from a given sampling set $\mathcal{I}$. It may be that the space $\mathbb{B}^{\mathcal{J}}$ is not
known exactly, or that we may have some erroneous estimate
$\hat{\mathcal{J}}$ of $\mathcal{J}$. The question is whether we can pick sampling
locations $\mathcal{I}$ that are robust for these estimation errors. We will
find some interesting phenomena, and the results can easily
be translated to apply to the sampling problem. The extreme
case is captured by the following definition.

Definition 3: An index set $\mathcal{I} \subseteq [0 : N-1]$ is a universal
sampling set if $\mathcal{I}$ is a sampling set for each $\mathbb{B}^{\mathcal{J}}$ with $|\mathcal{J}| = |\mathcal{I}|$.
See also [4] and [5].

If $\mathcal{I}$ is a universal sampling set, then while an interpolating
basis of a space $\mathbb{B}^{\mathcal{J}}$ still depends on $\mathcal{J}$, where the samples are
taken does not depend on $\mathcal{J}$. In Section V we will show that
there are universal sampling sets of any given size; in fact, we
will count them.

Very concretely, to ask if $\mathcal{I}$ is a universal sampling set
is to ask if there are rows of $\mathcal{F}$ indexed by $\mathcal{I}$, $|\mathcal{I}| = d$,
such that any $d \times d$ submatrix of $\mathcal{F}$ formed with these
rows is invertible. Phrased this way, standard properties of
Vandermonde determinants applied to $\mathcal{F}$ allow us to conclude:

Proposition 7: (i) If $\mathcal{I}$ is a set of $d$ consecutive indices,
reduced mod $N$,

\[ \mathcal{I} = \{i_0, i_0 + 1, \dots, i_0 + (d-1)\} \bmod N, \]

then $\mathcal{I}$ is a universal sampling set.

(ii) If $\mathcal{I}$ is a set of $d$ indices in arithmetic progression,
reduced mod $N$,

\[ \mathcal{I} = \{i_0, i_0 + s, i_0 + 2s, \dots, i_0 + (d-1)s\} \bmod N, \]

where $s$ is coprime to $N$, then $\mathcal{I}$ is a universal sampling set.

■ 
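For small $N$ the definition of a universal sampling set can be checked directly by testing every frequency support of the right size. The sketch below does this by brute force; it assumes NumPy, uses a numerical tolerance rather than exact arithmetic, and the name `is_universal_bruteforce` is hypothetical. Its cost grows like $\binom{N}{d}$, so it is meant only as an experiment.

```python
from itertools import combinations
import numpy as np

def is_universal_bruteforce(I, N, tol=1e-9):
    """Check numerically that every d x d submatrix of F built from rows I is invertible."""
    I = sorted(I)
    idx = np.arange(N)
    F = np.exp(-2j * np.pi * np.outer(idx, idx) / N)   # F[m, n] = zeta^(m n)
    rows = F[I, :]
    for J in combinations(range(N), len(I)):
        sub = rows[:, list(J)]
        # A (numerically) zero smallest singular value means a singular submatrix.
        if np.linalg.svd(sub, compute_uv=False).min() <= tol:
            return False
    return True

# Consecutive indices, and an arithmetic progression with step coprime to N.
print(is_universal_bruteforce([2, 3, 4], 12))
print(is_universal_bruteforce([1, 6, 11], 12))
# Step 2 is not coprime to 12; columns 0 and 6 agree on even rows.
print(is_universal_bruteforce([0, 2, 4], 12))
```

The first two calls illustrate parts (i) and (ii) of Proposition 7, and the third shows that the coprimality assumption in (ii) cannot simply be dropped.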

Much deeper is the following theorem of Chebotarev. 

Theorem 3: (Chebotarev) If $N$ is prime, then every square
submatrix of $\mathcal{F}$ is invertible.

And so, if $N$ is prime then any index set $\mathcal{I}$ is a universal
sampling set. Chebotarev's theorem dates to 1948 (the original
paper is in Russian) and there are now several published (and
unpublished) proofs, see, e.g., [8], [9], but this is by no means
a trivial result.

We will generalize Chebotarev's theorem when $N$ is a prime
power, and we will offer several characteristic properties of
universal sampling sets. We are indebted to the works of Tao
[6] and Delvaux and Van Barel [10].

The key is a quantitative, almost statistical comparison of
$\mathcal{I}$ to the simplest universal sampling set,

\[ \mathcal{I}^* = [0 : d-1], \]

when the elements of both $\mathcal{I}$ and $\mathcal{I}^*$ are reduced modulo prime
powers. We need several additional definitions to state our
main results.

A. Multisets and the Size of Congruence Classes 

We have found it conceptually helpful to use multisets in
the description of one of the central ideas, and we briefly
review this concept. Informally, a multiset is a finite, unordered
list $\tilde{A}$ whose elements are drawn from a finite set $A$, and
where, to distinguish a multiset from simply a set, elements
of the list may be repeated. More formally, a multiset is a pair
$(A, \chi_{\tilde{A}})$ where $\chi_{\tilde{A}}$ is the multiplicity function (generalizing
the characteristic function):

\[ \chi_{\tilde{A}} \colon A \to \mathbb{N}, \qquad \chi_{\tilde{A}}(a) = \text{the number of times } a \in A \text{ is listed in } \tilde{A}. \]

Two multisets $\tilde{A}$ and $\tilde{B}$ are equal if $\chi_{\tilde{A}} = \chi_{\tilde{B}}$, so the
individual elements are the same and so are their multiplicities.
The cardinality of $\tilde{A}$ is

\[ |\tilde{A}| = \sum_{a \in A} \chi_{\tilde{A}}(a). \]

It is common practice to use the standard set notation in
writing a multiset. Thus, for example, drawing from $\{a, b, c, d\}$
we write a multiset as $\{a, a, c, d, d\}$. The tilde notation $\tilde{A}$
for a multiset drawn from $A$ is helpful in discussing general
principles but, like all general notations, it has its limitations
in particular cases. It is a notation often used for covering
spaces, as we comment on below.

Associated with a multiset $\tilde{A}$ is another multiset

\[ \mathcal{M}(\tilde{A}) = \{\chi_{\tilde{A}}(a) : a \in A\}, \]

which we call the multiplicity multiset of $\tilde{A}$. Thus $\mathcal{M}(\tilde{A})$
records as a multiset the counts of the elements of $\tilde{A}$ and also
includes a zero for each element of $A$ that does not appear in
$\tilde{A}$. One can think of $\mathcal{M}(\tilde{A})$ as providing some statistics of $\tilde{A}$,
a kind of histogram of $\tilde{A}$ with bins from $A$, except that the
bins are not ordered.

Next, let $p$ be a prime, $k \ge 0$ an integer, and for $x \in \mathbb{N}$ let
$[x]_k$ be the residue of $x$ reduced mod $p^k$. For an index set $\mathcal{I}$
let

\[ \mathcal{I}/p^k = \{[i]_k : i \in \mathcal{I}\} \]

be the set of residues mod $p^k$ of the elements of $\mathcal{I}$, and let
$(\mathcal{I}/p^k)^\sim$ be the corresponding multiset, meaning that each
residue is listed according to its multiplicity, i.e., the size of
its congruence class. We regard the elements of $(\mathcal{I}/p^k)^\sim$ to
be drawn from $[0 : p^k - 1]$, all possible residues, and we
write $\chi_k \colon [0 : p^k - 1] \to \mathbb{N}$ for the multiplicity function
for the multiplicity multiset $\mathcal{M}((\mathcal{I}/p^k)^\sim)$. To be explicit, for
$a \in [0 : p^k - 1]$,

\[ \chi_k(a) = \text{the number of elements of } \mathcal{I} \text{ that leave a remainder of } a \text{ on dividing by } p^k. \tag{5} \]

In particular, $\chi_k(a) = 0$ means that no element of $\mathcal{I}$ leaves
a remainder of $a$ on dividing by $p^k$. In this case we speak of
an empty congruence class in $\mathcal{I}/p^k$. For $a \in [0 : p^k - 1]$ it
will also be helpful to use the notation

\[ \mathcal{I}_{ka} = \{i \in \mathcal{I} : i \equiv a \bmod p^k\} \]

for the elements of the congruence class of $a$ mod $p^k$. Then

\[ \chi_k(a) = |\mathcal{I}_{ka}|. \]

When we need to emphasize the index set, especially in
Section V, we will write $\chi_k(a; \mathcal{I})$. We note the obvious
properties:

• If $\mathcal{I}$ and $\mathcal{J}$ are disjoint then $\chi_k(a; \mathcal{I} \cup \mathcal{J}) = \chi_k(a; \mathcal{I}) +
\chi_k(a; \mathcal{J})$.

• $\mathcal{I} \subseteq \mathcal{J} \implies \chi_k(a; \mathcal{I}) \le \chi_k(a; \mathcal{J})$.

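Observe for $k = 0$ that $(\mathcal{I}/1)^\sim$ just consists of $|\mathcal{I}|$ zeros
and $\chi_0(0) = |\mathcal{I}|$. More generally,

\[ |\mathcal{I}| = \sum_{a=0}^{p^k - 1} \chi_k(a). \tag{6} \]

We also note that the multiplicity multiset $\mathcal{M}((\mathcal{I}/p^k)^\sim)$ depends
only on the bracelet of $\mathcal{I}$. While the multisets $(\mathcal{I}/p^k)^\sim$
will generally change if $\mathcal{I}$ is shifted or reversed, the counts of
the residues on dividing by $p^k$ will be the same:

\[ \mathcal{M}((\tau\mathcal{I}/p^k)^\sim) = \mathcal{M}((\mathcal{I}/p^k)^\sim) \quad \text{and} \quad \mathcal{M}((\rho\mathcal{I}/p^k)^\sim) = \mathcal{M}((\mathcal{I}/p^k)^\sim). \tag{7} \]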

Remark 1: Introducing the multiset $(\mathcal{I}/p^k)^\sim$ is reminiscent
of introducing covering spaces (for Riemann surfaces) to
resolve the problem of multivalued functions. Here we have
the remainder map $r \colon \mathcal{I} \to \mathcal{I}/p^k$, $r(i) = [i]$, which is
generally not injective and so has a multivalued inverse. Think
of the residues (with multiplicity) in $(\mathcal{I}/p^k)^\sim$ as tagged by the
number they come from, say as a pair $([i], i)$, which serves
to distinguish them much as we think of tagging points on
different sheets of a covering space of a Riemann surface.
Then we have the commutative diagram

\[ \mathcal{I} \xrightarrow{\ \tilde{r}\ } (\mathcal{I}/p^k)^\sim \xrightarrow{\ \mathrm{pr}\ } \mathcal{I}/p^k, \qquad \mathrm{pr} \circ \tilde{r} = r, \]

where $\mathrm{pr}$ is the projection map, $([i], i) \mapsto [i]$, and the lift
$\tilde{r}(i) = ([i], i)$ of $r$ is bijective. The value of the multiplicity function
$\chi_k([i])$ is then the number of elements in the preimage $\mathrm{pr}^{-1}([i])$,
analogous to the number of sheets over $[i]$. It will generally
vary with $[i]$.

Returning to our primary considerations, we write $\chi_k^*$ to
distinguish the special case when $\mathcal{I} = \mathcal{I}^*$. We will need the
following property of $\chi_k^*$:

\[ |\chi_k^*(a) - \chi_k^*(b)| \le 1, \tag{8} \]

for all $a, b \in [0 : p^k - 1]$ and all $k$. In words, when reducing
the elements of $\mathcal{I}^* = [0 : d-1]$ modulo $p^k$ for any $k$, the
congruence classes are all of about the same size. Or, pursuing
the analogy above, the preimages $\mathrm{pr}^{-1}([i])$ of the individual
residues all have approximately the same number of elements
and one might say that $\mathcal{I}^*/p^k$ is uniformly covered for each
$k$.

The inequality in (8) is easy to see. For some background
calculations we have found it helpful to have a formula for
$\chi_k^*$ (from which (8) also follows). If $\ell \in \mathcal{I}^*$ with $[\ell]_k = a \in
[0 : p^k - 1]$ then $\ell = a + \alpha p^k$ for an integer $\alpha \ge 0$, and since
$\ell \le d - 1$ we must have $0 \le \alpha \le (d - 1 - a)/p^k$. The number
of integers $\alpha$ for which this inequality holds is the number of
$\ell$ whose residue is $a$. Thus

\[ \chi_k^*(a) = \left\lfloor \frac{d - 1 - a}{p^k} \right\rfloor + 1. \tag{9} \]
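As a quick sanity check of (8) and (9), the following sketch compares the closed form for $\chi_k^*$ with a direct count of residues. The function names `chi` and `chi_star` are hypothetical; only the Python standard library is used.

```python
def chi(I, p, k):
    """Multiplicity function chi_k(.; I) as a list: entry a counts i in I with i = a mod p**k."""
    q = p ** k
    counts = [0] * q
    for i in I:
        counts[i % q] += 1
    return counts

def chi_star(d, p, k):
    """Closed form (9) for I* = [0 : d-1]: floor((d - 1 - a) / p**k) + 1."""
    q = p ** k
    # Python's // is floor division, so a > d - 1 correctly gives 0.
    return [(d - 1 - a) // q + 1 for a in range(q)]

# Compare the two for a range of sizes d, primes p and exponents k.
for d in range(1, 30):
    for p in (2, 3, 5):
        for k in range(4):
            direct = chi(range(d), p, k)
            assert direct == chi_star(d, p, k)          # formula (9)
            assert max(direct) - min(direct) <= 1       # inequality (8)
```

The loop passes silently, which is consistent with (8) following from (9): two floors of numbers that differ by less than 1 differ by at most 1.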
B. A Characterization of Universal Sampling Sets
Our main result is:

Theorem 4: Let $\mathcal{I}$ be an index set in $[0 : p^M - 1]$. The
following are equivalent:

(i) $\chi_k = \chi_k^*$ for all $0 \le k \le M$.

(ii) $|\chi_k(a) - \chi_k(b)| \le 1$ for all $a, b \in [0 : p^k - 1]$ and
$0 \le k \le M$.

(iii) $\mathcal{I}$ is a universal sampling set.

According to Proposition 5 and the relations (7), any index
set in the bracelet of $\mathcal{I}$ is also a universal sampling set.
Likewise, any index set in the bracelet of $\mathcal{I}^*$ can serve as
a model universal sampling set. Only condition (i) directly
compares $\mathcal{I}$ to $\mathcal{I}^*$, and in terms of multisets it could be stated
equivalently as

\[ \mathcal{M}((\mathcal{I}/p^k)^\sim) = \mathcal{M}((\mathcal{I}^*/p^k)^\sim). \]

Condition (i) for $k = 0$ guarantees that $\mathcal{I}$ and $\mathcal{I}^*$ have the same
size, from (6). Computing $\mathcal{M}((\mathcal{I}/p^k)^\sim)$ and $\mathcal{M}((\mathcal{I}^*/p^k)^\sim)$
for $k > M$ is redundant; since all elements in $\mathcal{I}$ and $\mathcal{I}^*$ are
in $[0 : p^M - 1]$, $\mathcal{M}((\mathcal{I}/p^k)^\sim)$ for $k > M$ is just indicative
of the cardinality of $\mathcal{I}$ and $\mathcal{I}^*$. Namely, for $k > M$, each
of $\mathcal{M}((\mathcal{I}/p^k)^\sim)$ and $\mathcal{M}((\mathcal{I}^*/p^k)^\sim)$ contains $|\mathcal{I}|$ ones and
$p^k - |\mathcal{I}|$ zeros. Condition (ii), a property only of $\mathcal{I}$, indirectly
compares $\mathcal{I}$ to $\mathcal{I}^*$ via (8). It says that $\mathcal{I}/p^k$, like $\mathcal{I}^*/p^k$, is
uniformly covered for each $k$.

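Before we embark on the proof of the theorem, here is an
example. Let $N = 2^3$, and $\mathcal{I} = \{0, 1, 3, 4, 6\}$. The following
are the multisets for $k = 1, 2, 3$:

\[
\begin{aligned}
(\mathcal{I}/2)^\sim &= \{0, 1, 1, 0, 0\}, & \mathcal{M}((\mathcal{I}/2)^\sim) &= \{3, 2\}; \\
(\mathcal{I}/2^2)^\sim &= \{0, 1, 3, 0, 2\}, & \mathcal{M}((\mathcal{I}/2^2)^\sim) &= \{2, 1, 1, 1\}; \\
(\mathcal{I}/2^3)^\sim &= \{0, 1, 3, 4, 6\}, & \mathcal{M}((\mathcal{I}/2^3)^\sim) &= \{1, 1, 0, 1, 1, 0, 1, 0\}.
\end{aligned}
\]

The computations for $\mathcal{I}^* = \{0, 1, 2, 3, 4\}$ yield

\[
\begin{aligned}
(\mathcal{I}^*/2)^\sim &= \{0, 1, 0, 1, 0\}, & \mathcal{M}((\mathcal{I}^*/2)^\sim) &= \{3, 2\}; \\
(\mathcal{I}^*/2^2)^\sim &= \{0, 1, 2, 3, 0\}, & \mathcal{M}((\mathcal{I}^*/2^2)^\sim) &= \{2, 1, 1, 1\}; \\
(\mathcal{I}^*/2^3)^\sim &= \{0, 1, 2, 3, 4\}, & \mathcal{M}((\mathcal{I}^*/2^3)^\sim) &= \{1, 1, 1, 1, 1, 0, 0, 0\}.
\end{aligned}
\]

We see that $\mathcal{M}((\mathcal{I}/2^k)^\sim) = \mathcal{M}((\mathcal{I}^*/2^k)^\sim)$ for $k = 1, 2, 3$,
and hence $\mathcal{I}$ is a universal sampling set. So in case the reader
has ever wondered, for the $8 \times 8$ Fourier matrix any $5 \times 5$
submatrix built from the rows indexed by $\mathcal{I}$, or from the rows
of an index set in the bracelet of $\mathcal{I}$, is invertible.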
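The computation in this example is easy to mechanize. The sketch below, in plain Python, tests condition (ii) of Theorem 4 and also compares multiplicity multisets as in condition (i); the function names are hypothetical, and the code assumes $N = p^M$ with $\mathcal{I} \subseteq [0 : p^M - 1]$.

```python
def multiplicity_multiset(I, p, k):
    """M((I/p^k)~) as a sorted list of residue counts mod p**k (zeros included)."""
    q = p ** k
    counts = [0] * q
    for i in I:
        counts[i % q] += 1
    return sorted(counts, reverse=True)

def satisfies_condition_ii(I, p, M):
    """Condition (ii) of Theorem 4: residue counts differ by at most 1 for 0 <= k <= M."""
    I = set(I)
    if not all(0 <= i < p ** M for i in I):
        raise ValueError("index set must lie in [0 : p**M - 1]")
    for k in range(M + 1):
        counts = multiplicity_multiset(I, p, k)
        if counts[0] - counts[-1] > 1:
            return False
    return True

I = [0, 1, 3, 4, 6]
I_star = list(range(len(I)))
for k in (1, 2, 3):
    print(k, multiplicity_multiset(I, 2, k), multiplicity_multiset(I_star, 2, k))
print(satisfies_condition_ii(I, 2, 3))
```

In contrast, `satisfies_condition_ii([0, 4], 2, 3)` is false, since both elements are even and the odd congruence class mod 2 is empty.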

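Proof of Theorem 4, (i) $\Longleftrightarrow$ (ii): Note: This equivalence does not require that $N$ be a prime power. The implication (i) $\Longrightarrow$ (ii) is immediate, since the residue counts of $\mathcal{I}^* = [0 : d-1]$ modulo $p^k$ differ from one another by at most one. Assume (ii) holds and let

\[ \chi = \min_a \chi_k(a). \]

From (ii) it follows that any $\chi_k(a)$ is either $\chi$ or $\chi + 1$. Suppose $r$ of the $p^k$ numbers $\chi_k(a)$ are equal to $\chi + 1$ and the rest are equal to $\chi$. The cardinality equation,

\[ \sum_{a=0}^{p^k - 1} \chi_k(a) = d, \]

then gives

\[ p^k \chi + r = d, \quad \text{with } 0 \le r < p^k. \tag{10} \]

This means that $\chi$ is the quotient on dividing $d$ by $p^k$ and $r$ is the remainder. In other words, (ii) and (10) together uniquely determine the multiset $\mathcal{M}((\mathcal{I}/p^k)^\sim) = \{\chi_k(a) : a \in [0 : p^k - 1]\}$. Since $\mathcal{I}$ and $\mathcal{I}^*$ both satisfy (ii) and (10), we must have $\mathcal{M}((\mathcal{I}/p^k)^\sim) = \mathcal{M}((\mathcal{I}^*/p^k)^\sim)$, or $\chi_k = \chi_k^*$. ■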

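We need two lemmas to prove that condition (i) implies that $\mathcal{I}$ is a universal sampling set. The first is a very old theorem on Vandermonde determinants, [11], as updated in [10]:

Lemma 1 (Delvaux and Van Barel): Let

\[
V = \begin{pmatrix}
x_1^{m_0} & x_2^{m_0} & \cdots & x_d^{m_0}\\
x_1^{m_1} & x_2^{m_1} & \cdots & x_d^{m_1}\\
\vdots & \vdots & & \vdots\\
x_1^{m_{d-1}} & x_2^{m_{d-1}} & \cdots & x_d^{m_{d-1}}
\end{pmatrix} \tag{11}
\]

be a $d \times d$ generalized Vandermonde matrix, with integer exponents $0 \le m_0 < m_1 < \cdots < m_{d-1}$. Then the determinant of $V$ is given by

\[ \det V = \prod_{1 \le i < j \le d} (x_j - x_i)\; S(x_1, x_2, \ldots, x_d), \tag{12} \]

where $S(x_1, \ldots, x_d)$ is a symmetric polynomial in $x_1, x_2, \ldots, x_d$ with integer coefficients such that

\[ S(1, 1, \ldots, 1) = \frac{\prod_{0 \le i < j \le d-1} (m_j - m_i)}{\prod_{0 \le i < j \le d-1} (j - i)}. \]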

The polynomial $S$ is called a Schur polynomial; see, for example, [12]. Based on this lemma we deduce a second result that is itself already a sufficient condition for an index set to be a universal sampling set.

Lemma 2: Let $\mathcal{I} = \{m_0, m_1, m_2, \ldots, m_{d-1}\}$. If

\[ \mu = \frac{\prod_{0 \le i < j \le d-1} (m_j - m_i)}{\prod_{0 \le i < j \le d-1} (j - i)} \tag{13} \]

is coprime to $p$, then $\mathcal{I}$ is a universal sampling set.

Note that without Lemma 1 it would not even be clear that $\mu$ is an integer. An intuitive idea for why this should be so is given below. The proof of Lemma 2 is along the lines of the proof of Chebotarev's theorem in [13], and also in [6].

Proof of Lemma 2: We make use of Lemma 1 in the case when $V = E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$. Each $x_\ell$ in (11) is then a power of $\zeta = e^{-2\pi i/N}$, $x_\ell = \zeta^{j_\ell}$, where $\mathcal{J} = \{j_1, j_2, \ldots, j_d\}$.

Suppose $\det V = 0$. Since the $x_\ell$ are distinct, (12) means that $S(x_1, x_2, \ldots, x_d) = 0$. Substituting $x_\ell = \zeta^{j_\ell}$ in $S(x_1, x_2, \ldots, x_d) = 0$, we obtain an equation of the form $s(\zeta) = 0$, where $s(x)$ is a polynomial in one variable with integer coefficients. This means that $\zeta$ is a root of $s(x)$ and, since $s(x)$ has only integer coefficients, $s(x)$ must contain the minimal polynomial of $\zeta$ over $\mathbb{Z}$ as a factor.

For $N = p^M$, the minimal polynomial of $\zeta$ over $\mathbb{Z}$ is $\phi_N(x)$ (the $N$'th cyclotomic polynomial). So we have $\phi_N(x) \mid s(x)$, where

\[ \phi_N(x) = 1 + x^{p^{M-1}} + x^{2p^{M-1}} + \cdots + x^{(p-1)p^{M-1}}. \]

Now $\phi_N$ and $s$ are both polynomials with integer coefficients, and $\phi_N$ is monic, hence $\phi_N(1) \mid s(1)$. However, $\phi_N(1) = p$, and $s(1) = S(1, 1, \ldots, 1) = \mu$. Thus

\[ p \mid \mu \quad \text{if } \det V = 0. \]

This proves the lemma. ■
Chebotarev's theorem follows from this result. If $N$ is a prime $p$ then $\mu$ is coprime to $p$ because every factor in the numerator and denominator of $\mu$ is a nonzero integer strictly between $-p$ and $p$.

We can now complete the proof of one direction of the implications in Theorem 4.

Proof of Theorem 4, (i) $\Longrightarrow$ (iii): Let $\mathcal{I} = \{m_1, m_2, m_3, \ldots, m_d\}$ and consider the product of differences

\[ A = \prod_{1 \le i < j \le d} (m_j - m_i). \]

There are $\chi_k(\ell)$ elements of $\mathcal{I}$ that leave a remainder of $\ell$ when divided by $p^k$. Moreover, $m_i \equiv m_j \bmod p^k$ if and only if $p^k \mid (m_j - m_i)$. The number of differences that have a factor of $p^k$ (or higher: $p^r$ for $r > k$) is

\[ \sum_{\ell=0}^{p^k - 1} \binom{\chi_k(\ell)}{2}, \]

and hence the number of differences that have a factor of exactly $p^k$ is given by

\[ \sum_{\ell=0}^{p^k - 1} \binom{\chi_k(\ell)}{2} - \sum_{\ell=0}^{p^{k+1} - 1} \binom{\chi_{k+1}(\ell)}{2}. \]

The largest power of $p$ that divides $A$ is then $p$ raised to

\[ \sum_{k \ge 1} k \left( \sum_{\ell=0}^{p^k - 1} \binom{\chi_k(\ell)}{2} - \sum_{\ell=0}^{p^{k+1} - 1} \binom{\chi_{k+1}(\ell)}{2} \right), \tag{14} \]

where the terms vanish once $k \ge M$ because the elements of $\mathcal{I}$ are distinct modulo $p^M$.

The expression (14) depends only on the values of $\chi_k$, but the hypothesis is that $\chi_k$ and $\chi_k^*$ agree as multisets for each $k$, and therefore the products $A = \prod (m_j - m_i)$ and $B = \prod (j - i)$ have the same powers of $p$ as factors. Hence $\mu = A/B$ is coprime to $p$ and from Lemma 2 we conclude that $\mathcal{I}$ is a universal sampling set. ■
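
The counting argument can be checked numerically. The sketch below is our own illustration, with hypothetical function names: it computes the power of $p$ in $A$ twice, once by factoring each difference and once from the residue counts. It relies on the fact that the sum in (14) telescopes to $\sum_{k \ge 1} \sum_\ell \binom{\chi_k(\ell)}{2}$. It also evaluates $\mu = A/B$ directly.

```python
from itertools import combinations
from math import comb, prod

def p_adic_valuation(n, p):
    """Largest e such that p^e divides the nonzero integer n."""
    n, e = abs(n), 0
    while n % p == 0:
        n //= p
        e += 1
    return e

def valuation_direct(index_set, p):
    """Power of p in A, found by factoring each difference m_j - m_i."""
    return sum(p_adic_valuation(b - a, p)
               for a, b in combinations(sorted(index_set), 2))

def valuation_from_counts(index_set, p, M):
    """The same power, as the sum over k = 1..M of sum_l C(chi_k(l), 2)."""
    total = 0
    for k in range(1, M + 1):
        q = p ** k
        counts = [0] * q
        for i in index_set:
            counts[i % q] += 1
        total += sum(comb(c, 2) for c in counts)
    return total

def mu(index_set):
    """mu = A / B from (13), for an index set of distinct integers."""
    m = sorted(index_set)
    A = prod(b - a for a, b in combinations(m, 2))
    B = prod(j - i for i, j in combinations(range(len(m)), 2))
    assert A % B == 0  # mu is an integer, as Lemma 1 guarantees
    return A // B

I = [0, 1, 3, 4, 6]
print(valuation_direct(I, 2), valuation_from_counts(I, 2, 3), mu(I))
```

For the running example the two valuations agree with the valuation of $B$ for $\mathcal{I}^* = \{0, 1, 2, 3, 4\}$, so $\mu$ is odd, in line with Lemma 2.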
Remark 2: The argument above also gives an insight, if not a proof, as to why $\mu = A/B$ in (13) is an integer. Suppose



M((2/p k )~) = {ri,r 2 ,r 3 
Y[( m i ~ m j) is given by 

d 

E 



. , r<j}. The power of p in A = 




Now, 2i=i r « ^ the cardinality of I so 



d 



Hence for a set I which has the minimum power of p k in A 
it must be that Ai((I/p k )~) — {ri, ■ ■ ■ rd} is a solution 
to 



minimize r\ 



subject to n + r 2 + 



r d = d. 



On the reals the optimal solution satisfies $r_1 = r_2 = \cdots = r_d$. This suggests that the set $\mathcal{I}$ with the smallest power of $p^k$ in $A$ must have roughly an equal number of elements in each congruence class. $\mathcal{I}^* = \{0, 1, 2, \ldots, d - 1\}$ is one such set. Thus the power of $p^k$ is smaller in $B = \prod (i - j)$ than in $A = \prod (m_i - m_j)$ for each $p$ and $k$, and, if the reasoning is to be trusted, $\mu = A/B$ is an integer.

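To finish the proof of Theorem 4 we will derive the following bounds on $\chi_k$.

Lemma 3: If $\mathcal{I} \subseteq [0 : p^M - 1]$ is a universal sampling set of size $d$ then

\[ \left\lfloor \frac{d}{p^k} \right\rfloor \le \chi_k(s) \le \left\lceil \frac{d}{p^k} \right\rceil, \quad s \in [0 : p^k - 1], \; 0 \le k \le M. \tag{15} \]

It follows immediately from (15) that if $\mathcal{I}$ is a universal sampling set then

\[ |\chi_k(a) - \chi_k(b)| \le 1, \quad a, b \in [0 : p^k - 1]. \]

This is condition (ii), and with this result the proof of Theorem 4 will be complete. Incidentally, for the case $\mathcal{I} = \mathcal{I}^*$, (15) is a simple consequence of the definitions, since the residues of $0, 1, \ldots, d-1$ modulo $p^k$ cycle through $[0 : p^k - 1]$.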

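The argument for Lemma 3 is through constructing submatrices of the Fourier matrix of known rank to obtain upper and lower bounds for $\chi_k$. The first step is to build a particular model submatrix, and this requires some bookkeeping.

Let $\mathcal{I} \subseteq [0 : p^M - 1]$, at this point not assumed to be a universal sampling set. Fix $k \le M$ and $s \in [0 : p^k - 1]$, and recall that we let

\[ \mathcal{I}_{ks} = \{i \in \mathcal{I} : i \equiv s \bmod p^k\}. \]

The set $\mathcal{I}_{ks}$ has $\chi_k(s)$ elements. List them, in numerical order, as $i_0, i_1, i_2, \ldots, i_c$, where we put $c = \chi_k(s) - 1$ to simplify notation. Let $r$ be a nonnegative integer and define the column vector of length $c + 1$ by

\[ \mathfrak{z}^r = \left[ \zeta_N^{-i_0 r},\; \zeta_N^{-i_1 r},\; \zeta_N^{-i_2 r},\; \ldots,\; \zeta_N^{-i_c r} \right]^T. \]

Now let $Y$ be the $(c+1) \times p^k$ matrix obtained by repeating $p^k$ copies of the column $\mathfrak{z}^r$:

\[ Y = \underbrace{[\, \mathfrak{z}^r \;\; \mathfrak{z}^r \;\; \mathfrak{z}^r \;\; \cdots \;\; \mathfrak{z}^r \,]}_{p^k \text{ times}}, \]

and let $D_s$ be the $p^k \times p^k$ diagonal matrix

\[
D_s = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0\\
0 & \zeta_{p^k}^{-s} & 0 & \cdots & 0\\
0 & 0 & \zeta_{p^k}^{-2s} & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & \zeta_{p^k}^{-(p^k - 1)s}
\end{pmatrix}.
\]

Finally, let $k' = M - k$, and set

\[ \mathcal{J}_{k'r} = \{0 \cdot p^{k'} + r,\; 1 \cdot p^{k'} + r,\; 2 \cdot p^{k'} + r,\; \ldots,\; (p^k - 1) \cdot p^{k'} + r\}. \tag{16} \]

From the Fourier matrix $\mathcal{F}$ we choose the $c + 1$ rows indexed by $\mathcal{I}_{ks}$ and the $p^k$ columns indexed by $\mathcal{J}_{k'r}$. The result of these choices, we claim, is

\[ E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}_{k'r}} = Y D_s. \tag{17} \]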
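After the preparations, the derivation of (17) is straightforward. The $(a, b)$-entry of $E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}_{k'r}}$ is

\[
\zeta_N^{-i_a (b p^{k'} + r)}
= \exp\!\left( -\frac{(2\pi i)\, i_a (b p^{M-k} + r)}{p^M} \right)
= \exp\!\left( -\frac{(2\pi i)\, i_a b}{p^k} \right) \exp\!\left( -\frac{(2\pi i)\, i_a r}{p^M} \right).
\]

But now recall that, by definition, when $i_a \in \mathcal{I}_{ks}$ is divided by $p^k$ it leaves a remainder of $s$, and thus

\[
\exp\!\left( -\frac{(2\pi i)\, i_a b}{p^k} \right) \exp\!\left( -\frac{(2\pi i)\, i_a r}{p^M} \right)
= \exp\!\left( -\frac{(2\pi i)\, s b}{p^k} \right) \exp\!\left( -\frac{(2\pi i)\, i_a r}{p^M} \right)
= \zeta_{p^k}^{-sb}\, \zeta_N^{-i_a r}.
\]

This construction is the basis for the proof of Lemma 3 but applied in block form.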

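Proof of Lemma 3: To deduce the upper bound $\chi_k(s) \le \lceil d/p^k \rceil$ we begin by letting

\[ \mathcal{J} = \mathcal{J}_{k'0} \cup \mathcal{J}_{k'1} \cup \mathcal{J}_{k'2} \cup \cdots \cup \mathcal{J}_{k'd'}, \quad d' = \left\lceil \frac{d}{p^k} \right\rceil - 1, \]

where $\mathcal{J}_{k'r}$ is defined as in (16). Note that $\mathcal{J}$ is a union of $\lceil d/p^k \rceil$ disjoint sets. Each $\mathcal{J}_{k'r}$, $0 \le r \le d' = \lceil d/p^k \rceil - 1$, indexes the choice of $p^k$ columns from $\mathcal{F}$ and applying (17) we have

\[
E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}
= [\, Y^0 \;\; Y^1 \;\; \cdots \;\; Y^{d'} \,]
\begin{pmatrix}
D_s & & & \\
 & D_s & & \\
 & & \ddots & \\
 & & & D_s
\end{pmatrix},
\]

where $Y^r$ consists of $p^k$ copies of the column $\mathfrak{z}^r$. The diagonal matrix in this product is invertible, hence

\[
\text{Rank of } E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}
= \text{Rank of } [\, \mathfrak{z}^0 \;\; \mathfrak{z}^1 \;\; \mathfrak{z}^2 \;\; \cdots \;\; \mathfrak{z}^{d'} \,]
\le \text{Number of distinct columns} = \left\lceil \frac{d}{p^k} \right\rceil. \tag{18}
\]

Now, the number of columns of $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ is equal to

\[
|\mathcal{J}| = |\mathcal{J}_{k'0} \cup \mathcal{J}_{k'1} \cup \mathcal{J}_{k'2} \cup \cdots \cup \mathcal{J}_{k'd'}|
= \sum_{r=0}^{\lceil d/p^k \rceil - 1} |\mathcal{J}_{k'r}| = p^k \left\lceil \frac{d}{p^k} \right\rceil \ge d,
\]

so there are at least $d$ columns. Hence if $\mathcal{I}$ is a universal sampling set of size $d$ then $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ must be of full row rank. In particular, since $\mathcal{I}_{ks} \subseteq \mathcal{I}$, it must be that $E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}$ is also of full row rank, for each $s$. Next, the number of rows in $E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}$ is equal to $|\mathcal{I}_{ks}| = \chi_k(s)$ by definition. From (18) we know that the rank of $E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}$ is at most $\lceil d/p^k \rceil$, and so we have

\[ (\text{Number of rows}) \quad \chi_k(s) \le \left\lceil \frac{d}{p^k} \right\rceil. \]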

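The proof of the lower bound $\chi_k(s) \ge \lfloor d/p^k \rfloor$ is very similar. This time we construct a set $\mathcal{J}$ with $|\mathcal{J}| \le d$, and observe that if $\mathcal{I}$ is a universal sampling set of size $d$, then $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ is of full column rank.

Let

\[ \mathcal{J} = \mathcal{J}_{k'0} \cup \mathcal{J}_{k'1} \cup \mathcal{J}_{k'2} \cup \cdots \cup \mathcal{J}_{k'd''}, \quad d'' = \left\lfloor \frac{d}{p^k} \right\rfloor - 1. \]

Then just as above,

\[
\text{Rank of } E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}
= \text{Rank of } [\, \mathfrak{z}^0 \;\; \mathfrak{z}^1 \;\; \mathfrak{z}^2 \;\; \cdots \;\; \mathfrak{z}^{d''} \,]
\le \text{Number of distinct columns} = \left\lfloor \frac{d}{p^k} \right\rfloor.
\]

The number of rows of $E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}}$ is $|\mathcal{I}_{ks}| = \chi_k(s)$, and so we must have

\[ \text{Rank of } E_{\mathcal{I}_{ks}}^T \mathcal{F} E_{\mathcal{J}} \le \min\{ \lfloor d/p^k \rfloor, \chi_k(s) \}. \tag{19} \]

Furthermore, up to a permutation of the rows,

\[
E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}} =
\begin{pmatrix}
E_{\mathcal{I}_{k0}}^T \mathcal{F} E_{\mathcal{J}}\\
E_{\mathcal{I}_{k1}}^T \mathcal{F} E_{\mathcal{J}}\\
\vdots\\
E_{\mathcal{I}_{k,p^k-1}}^T \mathcal{F} E_{\mathcal{J}}
\end{pmatrix},
\]

whence

\[
\begin{aligned}
\text{Row rank of } E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}
&\le \text{Rank of } E_{\mathcal{I}_{k0}}^T \mathcal{F} E_{\mathcal{J}} + \text{Rank of } E_{\mathcal{I}_{k1}}^T \mathcal{F} E_{\mathcal{J}} + \cdots + \text{Rank of } E_{\mathcal{I}_{k,p^k-1}}^T \mathcal{F} E_{\mathcal{J}}\\
&\le \sum_{s=0}^{p^k - 1} \min\{ \lfloor d/p^k \rfloor, \chi_k(s) \}.
\end{aligned} \tag{20}
\]

Now, the number of columns indexed by $\mathcal{J}$ is $p^k \lfloor d/p^k \rfloor \le d$. Hence if $\mathcal{I}$ is a universal sampling set of size $d$, we need $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ to be of full column rank. From (20) this means we must have

\[ (\text{Number of columns}) \quad p^k \lfloor d/p^k \rfloor \le \sum_{s=0}^{p^k - 1} \min\{ \lfloor d/p^k \rfloor, \chi_k(s) \}. \]

This inequality will not be satisfied unless $\lfloor d/p^k \rfloor \le \chi_k(s)$ for all $s$. This completes the proof. ■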
Remark 3: For many values of $d$, it is enough to prove one side of the inequality (15). If we know that $\chi_k(s) \le \lceil d/p^k \rceil$, then from $\sum_s \chi_k(s) = d$ and a recurrence relation, (23) below, it is possible to prove that $\lfloor d/p^k \rfloor \le \chi_k(s)$. Such cases include

1) $N = p^M$, $d = c_0 p^k + c_1 p^{k-1}$ for $c_0, c_1 \in \{0, 1, 2, \ldots, p-1\}$.

2) $N = 2^M$, $d = c_0 2^k + c_1 2^{k-1} + c_2 2^{k-2}$ for $c_0, c_1, c_2 \in \{0, 1\}$.

3) $N = 2^M$, $d = 2^k + 2^{k-1} + 2^{k-2} + \cdots + 2^{k-r+1}$ for some $r$.
C. Digit Reversal and Universal Sampling Sets

There is another interesting characterization of universal sampling sets in terms of digit reversal. Expanding in base $p$, any integer $a \in [0 : p^m - 1]$, $m \ge 1$, can be written uniquely as

\[ a = a_0 + a_1 p + a_2 p^2 + \cdots + a_{m-1} p^{m-1}, \]

where the $a$'s are in $[0 : p - 1]$. We define a permutation $\pi_m : [0 : p^m - 1] \to [0 : p^m - 1]$ by

\[ \pi_m(a_0 + a_1 p + a_2 p^2 + \cdots + a_{m-1} p^{m-1}) = a_{m-1} + a_{m-2} p + a_{m-3} p^2 + \cdots + a_0 p^{m-1}. \]

The $a$'s are the digits in the base $p$ expansion of $a \in [0 : p^m - 1]$ and applying $\pi_m$ to $a$ produces the number in $[0 : p^m - 1]$ with the digits reversed. For example (an example we will use again in Section V), take $p = 2$ and $[0 : 7]$. Then $\pi_3([0 : 7]) = \{0, 4, 2, 6, 1, 5, 3, 7\}$ in that order. Such digit reversing permutations were used in [10] to find rank-one submatrices of the Fourier matrix.
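
The permutation $\pi_m$ is straightforward to compute by peeling off base-$p$ digits. The following short Python sketch is ours (the name `digit_reverse` is an assumption, not notation from the text) and reproduces the example for $p = 2$, $m = 3$.

```python
def digit_reverse(a, p, m):
    """pi_m(a): reverse the m base-p digits of a, for a in [0 : p^m - 1]."""
    out = 0
    for _ in range(m):
        a, digit = divmod(a, p)   # strip the lowest remaining digit of a
        out = out * p + digit     # and append it to the reversed number
    return out

print([digit_reverse(a, 2, 3) for a in range(8)])
```

Since reversing twice restores the original digits, `digit_reverse` is its own inverse for fixed `p` and `m`.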

The issue for universal sampling sets is how the numbers $\pi_M(\mathcal{I})$ are dispersed within the interval $[0 : p^M - 1]$, where, as before, $N = p^M$. To make this precise, take $k \ge 1$ and partition $[0 : p^M - 1]$ into $p^k$ equal parts:

\[ [0 : p^M - 1] = \bigcup_{a=0}^{p^k - 1} [a p^{k'} : (a+1) p^{k'} - 1], \quad k' = M - k. \]

For any $\mathcal{J} \subseteq [0 : p^M - 1]$ and $a \in [0 : p^k - 1]$, let

\[ \phi_k(a ; \mathcal{J}) = |\mathcal{J} \cap [a p^{k'} : (a+1) p^{k'} - 1]|. \]

We say that $\mathcal{J}$ is uniformly dispersed in $[0 : p^M - 1]$ if

\[ |\phi_k(a ; \mathcal{J}) - \phi_k(b ; \mathcal{J})| \le 1 \tag{21} \]

for all $a, b \in [0 : p^k - 1]$, and $1 \le k \le M$. Thus $\mathcal{J}$ is uniformly dispersed if roughly equal numbers of its elements are in each of the intervals $[a p^{k'} : (a+1) p^{k'} - 1]$ for all $1 \le k \le M$, $k' = M - k$.

We will show

\[ \phi_k(\pi_k(a) ; \pi_M(\mathcal{I})) = \chi_k(a), \quad a \in [0 : p^k - 1]. \tag{22} \]

Thus, to the three equivalent conditions in Theorem 4 we can add a fourth:

(iv) $\pi_M(\mathcal{I})$ is uniformly dispersed.

The derivation of (22) uses the following lemma.

Lemma 4: If $j \in [0 : p^M - 1]$ is given by $j = b + a p^{k'}$, $0 \le b \le p^{k'} - 1$, then $\pi_M(j) = \pi_k(a) + p^k \pi_{k'}(b)$.

The proof is straightforward, and the argument for (22) then goes very quickly. As defined, for any index set $\mathcal{J}$, $\phi_k(a ; \mathcal{J})$ is the number of elements in $\mathcal{J}$ that lie in $[a p^{k'} : (a+1) p^{k'} - 1]$, and these are precisely the $j \in \mathcal{J}$ of the form $a p^{k'} + b$ with $0 \le b \le p^{k'} - 1$. Thus for $i \in [0 : p^k - 1]$,

\[
\begin{aligned}
\phi_k(\pi_k(i) ; \pi_M(\mathcal{I}))
&= \text{the number of } j \in \pi_M(\mathcal{I}) \text{ of the form } \pi_k(i) p^{k'} + b, \; 0 \le b \le p^{k'} - 1\\
&= \text{the number of } j \in \mathcal{I} \text{ of the form } p^k \pi_{k'}(b) + i, \; 0 \le b \le p^{k'} - 1 \quad \text{(from Lemma 4)}\\
&= \text{the number of } j \in \mathcal{I} \text{ that leave a remainder of } i \text{ on dividing by } p^k\\
&= \chi_k(i).
\end{aligned}
\]
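
Identity (22) holds for every index set, universal or not, which makes it convenient to test. The sketch below is our own illustration with hypothetical names: it bins the digit-reversed set into the intervals $[a p^{k'} : (a+1) p^{k'} - 1]$ and compares the counts with the residue counts $\chi_k$.

```python
def digit_reverse(a, p, m):
    """pi_m(a): reverse the m base-p digits of a."""
    out = 0
    for _ in range(m):
        a, digit = divmod(a, p)
        out = out * p + digit
    return out

def identity_22_holds(index_set, p, M):
    """Check phi_k(pi_k(a); pi_M(I)) == chi_k(a) for all a and all k = 1..M."""
    reversed_set = [digit_reverse(i, p, M) for i in index_set]
    for k in range(1, M + 1):
        q, width = p ** k, p ** (M - k)
        chi = [0] * q                 # residue counts of I modulo p^k
        for i in index_set:
            chi[i % q] += 1
        phi = [0] * q                 # counts of pi_M(I) in each interval
        for j in reversed_set:
            phi[j // width] += 1
        if any(phi[digit_reverse(a, p, k)] != chi[a] for a in range(q)):
            return False
    return True

print(identity_22_holds([0, 1, 3, 4, 6], 2, 3))
```

Integer division by $p^{k'}$ picks out the interval index, which is why no explicit interval endpoints are needed.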

V. Structure and Enumeration of Universal 
Sampling Sets 

In this section we analyze in detail the structure of universal sampling sets. Specifically we show that when $N = p^M$ is a prime power such a set $\mathcal{I}$ is the disjoint union of smaller, elementary universal sets that depend on the base $p$ expansion of $|\mathcal{I}|$. The method is algorithmic, allowing us to construct universal sets of a given size, and to find a formula that counts the number of universal sets as a function of $p^M$ and $|\mathcal{I}|$. In particular the formula answers the question: How likely is it that a randomly chosen index set is universal?
Not very likely, but there are several subtle aspects to the 
answer. For example, we exhibit plots of the counting function 
showing some striking phenomena depending on the prime p. 
Our approach is via maximal universal sampling sets which, 
in turn, enter naturally in studying the relationship between 
universal sampling sets and uncertainty principles. We take 
up the latter topic in the next section. 

A. A Recurrence Relation and Tree for $\chi$

When $N = p^M$ the condition that an index set be a universal sampling set depends on the values of $\chi_k$ for different $k$. To study this we use a recurrence relation in $k$ for $\chi_k(a)$. The formula holds even when $N$ is not a prime power.

Lemma 5: Let $\mathcal{I} \subseteq [0 : N - 1]$. Then

\[ \chi_{k-1}(a) = \sum_{j=0}^{p-1} \chi_k(a + j p^{k-1}), \tag{23} \]

for all $a \in [0 : p^{k-1} - 1]$.

Proof: An integer $x \in \mathcal{I}$ that leaves a remainder of $a$ when divided by $p^{k-1}$ is of the form $x = \alpha p^{k-1} + a$. Let $\alpha = \beta p + \gamma$ for $\gamma \in [0 : p - 1]$. Then $x = \beta p^k + \gamma p^{k-1} + a$, that is, $x$ leaves a remainder of either $0 \cdot p^{k-1} + a$, $1 \cdot p^{k-1} + a$, $2 \cdot p^{k-1} + a$, $\ldots$ or $(p - 1) \cdot p^{k-1} + a$ on dividing by $p^k$. The result follows. ■

When $N = p^M$ the recurrence formula and the relation it expresses between congruence classes has an appealing interpretation in terms of a $p$-ary tree. Several arguments in this section will be based on this configuration.

Let $\mathcal{I} \subseteq [0 : p^M - 1]$. We construct a tree with $M + 1$ levels and $p^k$ nodes in level $k$, $0 \le k \le M$. The nodes in level $k$ are identified by a pair $(k, a)$, with $a \in [0 : p^k - 1]$. Call the nodes at the level $M$ the leaves. At the node $(k, a)$ we imagine placing the congruence class $\mathcal{I}_{ka} = \{i \in \mathcal{I} : i \equiv a \bmod p^k\}$.

The root is $\mathcal{I}_{00} = \mathcal{I}$ and the nodes at the leaves host the sets $\mathcal{I}_{Ma}$, $a \in [0 : p^M - 1]$, each of which is either a singleton or empty. We assign a weight of $\chi_k(a) = |\mathcal{I}_{ka}|$ to the node $(k, a)$. Further, at each level we arrange the nodes according to the digit reversing permutation, i.e., nodes at level $k$ are arranged as $\pi_k([0 : p^k - 1])$, where $\pi_k$ is the digit reversing permutation from Section IV-C. (This is similar to the starting step of the FFT algorithm, where the indices are sorted according to the reversed digits.) Figure 2 shows the case $N = 2^3$, a binary tree with four levels, $k = 0, 1, 2, 3$. In the third level of the tree the nodes are ordered $0, 4, 2, 6, 1, 5, 3, 7$, which is $\pi_3([0 : 7])$. Then:

1) The set $\mathcal{I}_{ka}$ at level $k$ is the disjoint union of the sets at its children nodes at level $k + 1$.

2) The value of $\chi_k(a)$ at the node $(k, a)$ is the sum of the values of $\chi_{k+1}$ at its children nodes at level $k + 1$. In other words, the weight of a parent is the sum of the weights of its children; this is the recurrence relation. Consequently, the value of $\chi_k$ at any node is the sum of the values of $\chi_M$ at the leaves at level $M$ descended from the node.

For example, in Figure 2 we have

\[ \chi_0(0) = \sum_{a=0}^{7} \chi_3(a), \quad \chi_1(0) = \sum_{a=0}^{3} \chi_3(2a), \quad \chi_1(1) = \sum_{a=0}^{3} \chi_3(2a + 1), \]

and so on.

In fact, a more general conclusion is the following: Fix a level $k$. Then the value of $\chi_r$ at any node $(r, a)$, for $r \le k$, is the sum of the values of $\chi_k$ at the level-$k$ nodes descending from the tree node $(r, a)$.

When the root is $[0 : p^M - 1]$, the extreme case, the leaves are all singletons and the nodes at level $k$ are each of weight $p^{M-k}$.
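
The tree of weights can be tabulated level by level. In the sketch below, which is our own illustration with hypothetical names, level $k$ is stored as a list in digit-reversed order, so position $t$ holds $\chi_k(\pi_k(t))$. With that ordering the children of position $t$ occupy positions $pt, \ldots, pt + p - 1$ on the next level, and the recurrence (23) becomes a sum over a contiguous slice.

```python
def digit_reverse(a, p, m):
    """pi_m(a): reverse the m base-p digits of a."""
    out = 0
    for _ in range(m):
        a, digit = divmod(a, p)
        out = out * p + digit
    return out

def tree_levels(index_set, p, M):
    """levels[k][t] = chi_k(pi_k(t)): node weights at level k, digit-reversed order."""
    levels = []
    for k in range(M + 1):
        q = p ** k
        chi = [0] * q
        for i in index_set:
            chi[i % q] += 1
        levels.append([chi[digit_reverse(t, p, k)] for t in range(q)])
    return levels

def parents_sum_children(levels, p):
    """Recurrence (23): each weight equals the sum of its p children's weights."""
    return all(
        upper[t] == sum(lower[p * t : p * t + p])
        for upper, lower in zip(levels, levels[1:])
        for t in range(len(upper))
    )

levels = tree_levels([0, 1, 3, 4, 6], 2, 3)
for row in levels:
    print(row)
print(parents_sum_children(levels, 2))
```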

B. Elementary and Maximal Sets 

To study the structure of universal sampling sets we need a series of definitions. When $N$ is a prime power the building blocks are the elementary sets:

Definition 4: A set $\mathcal{E} \subseteq [0 : p^M - 1]$ is a $k$-elementary set if

\[ \chi_k(a) = 1, \quad \text{for all } a \in [0 : p^k - 1]. \]

Note that $|\mathcal{E}| = p^k$.

As a first application of the formula (23) we can add the adjective "universal" to the description of elementary sets.

Lemma 6: A $k$-elementary set $\mathcal{E}$ is a universal sampling set.

Proof: From $\chi_k(a) = 1$ and (23) it follows that $\mathcal{E}$ has an equal number of elements in each congruence class modulo $p^s$, $s \le k$. More precisely,

\[ \chi_s(a) = p^{k-s}, \tag{24} \]

for all $s \le k$. Also from (23), for $s > k$ all the congruence classes are of size 0 or 1, i.e.

\[ \chi_s(a) \in \{0, 1\}. \tag{25} \]

Therefore

\[ |\chi_s(a) - \chi_s(b)| \le 1, \]

for all $a, b \in [0 : p^s - 1]$ and all $s$, and we conclude that $\mathcal{E}$ is a universal sampling set. ■

Next, a fruitful approach to understanding the structure of 
universal sampling sets is to ask how well an arbitrary index 
set is approximated from within by universal sets. 

Definition 5: Let $\mathcal{I} \subseteq [0 : N - 1]$. A maximal universal sampling set for $\mathcal{I}$ is a universal sampling set of largest cardinality that is contained in $\mathcal{I}$.

Note that the definition does not require $N$ to be a prime power, though this will most often be the case. There is an allied notion of a minimal universal set. We define this in Subsection V-E below, and show how the two are related. Maximal and minimal sets enter naturally and together in connection with uncertainty principles, discussed in Section VI.

Finding a maximal universal sampling set for a given $\mathcal{I}$ is a finitary process, so existence is not an issue. However, maximal universal sampling sets need not be unique. For example, take $N = 3^2$ and $\mathcal{I} = \{0, 1, 2, 3, 6\}$. The set $\mathcal{I}$ is not itself a universal sampling set, and both $\{0, 1, 2, 3\}$ and $\{0, 1, 2, 6\}$ are maximal universal sampling sets contained in $\mathcal{I}$.
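
For small index sets the non-uniqueness can be seen by exhaustive search. The sketch below is our own and is only a brute-force illustration, not the construction developed later in this section. It tests universality through condition (ii) of Theorem 4, which assumes $N = p^M$, and the function names are hypothetical.

```python
from itertools import combinations

def is_universal(index_set, p, M):
    """Condition (ii) of Theorem 4: residue counts mod p^k differ by at most 1."""
    for k in range(1, M + 1):
        q = p ** k
        counts = [0] * q
        for i in index_set:
            counts[i % q] += 1
        if max(counts) - min(counts) > 1:
            return False
    return True

def maximal_universal_subsets(index_set, p, M):
    """All universal subsets of the largest possible size, by brute force."""
    items = sorted(index_set)
    for size in range(len(items), 0, -1):
        found = [set(c) for c in combinations(items, size) if is_universal(c, p, M)]
        if found:
            return found
    return [set()]

for subset in maximal_universal_subsets([0, 1, 2, 3, 6], 3, 2):
    print(sorted(subset))
```

The search is exponential in $|\mathcal{I}|$, so it is useful only as a check on small examples such as this one.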

Despite the lack of uniqueness it will be convenient to have a notation, and we let $\Omega(\mathcal{I})$ denote a generic maximal universal sampling set in $\mathcal{I}$. The cardinality $|\Omega(\mathcal{I})|$ is well-defined; by definition $|\mathcal{J}| \le |\Omega(\mathcal{I})|$ for any universal sampling set $\mathcal{J} \subseteq \mathcal{I}$.

Elementary sets and maximal sets are related through an important construction of an elementary set.

Definition 6: Let $\mathcal{I} \subseteq [0 : p^M - 1]$ and let $k$ be the largest integer such that no congruence class in $\mathcal{I}/p^k$ is empty. (It might be that $k = 0$.) Let $\mathcal{I}^\dagger$ denote an elementary set obtained by choosing one element from each congruence class in $\mathcal{I}/p^k$.

By Lemma 6, $\mathcal{I}^\dagger$ is a universal sampling set, and is of order $p^k$. We now have

Theorem 5: Let $\mathcal{I} \subseteq [0 : p^M - 1]$, and $\mathcal{I}^\dagger$ as above. Then

(i) $p^k \le |\Omega(\mathcal{I})| < p^{k+1}$.

(ii) There exists a maximal universal sampling set contained in $\mathcal{I}$ and containing $\mathcal{I}^\dagger$.

Proof: The lower bound in (i) follows from the definition of a maximal set and the comments above,

\[ p^k = |\mathcal{I}^\dagger| \le |\Omega(\mathcal{I})|. \]

To prove the upper bound, suppose $\mathcal{J} \subseteq \mathcal{I}$ has $|\mathcal{J}| \ge p^{k+1}$. By the definition of $k$ at least one congruence class in $\mathcal{J}/p^{k+1}$ is empty, so $\chi_{k+1}(a ; \mathcal{J}) = 0$ for some $a \in [0 : p^{k+1} - 1]$. From the cardinality equation,

\[ \sum_{\ell=0}^{p^{k+1} - 1} \chi_{k+1}(\ell ; \mathcal{J}) = |\mathcal{J}| \ge p^{k+1}. \]
Fig. 2: A tree representing the relations between the congruence classes, and the recurrence relation satisfied by $\chi_k(a)$. The value of $\chi_k(a)$ at any node is the sum of the values of $\chi_{k+1}$ at its children nodes in level $k + 1$.

This implies that at least one congruence class in $\mathcal{J}/p^{k+1}$ has at least two elements, or $\chi_{k+1}(b ; \mathcal{J}) \ge 2$ for some $b$. We then have

\[ |\chi_{k+1}(b ; \mathcal{J}) - \chi_{k+1}(a ; \mathcal{J})| \ge 2 > 1, \]

and $\mathcal{J}$ cannot be a universal sampling set.

For part (ii), we first show that any maximal universal sampling set $\Omega(\mathcal{I})$ must contain at least one element from each congruence class in $\mathcal{I}/p^k$. By way of contradiction, suppose that $\chi_k(a ; \Omega(\mathcal{I})) = 0$ for some $a$. Since $\Omega(\mathcal{I})$ is universal we must then have $\chi_k(b ; \Omega(\mathcal{I})) \le 1$ for all $b$. By the cardinality equation,

\[ |\Omega(\mathcal{I})| = \sum_{b=0}^{p^k - 1} \chi_k(b ; \Omega(\mathcal{I})) \le p^k - 1 < p^k, \]

contradicting the lower bound in (i).

Let $\mathcal{K} \subseteq \Omega(\mathcal{I})$ be an elementary set, of size $p^k$, that contains one element from each congruence class in $\mathcal{I}/p^k$, guaranteed to exist from what we just showed. Assuming $\mathcal{K} \ne \mathcal{I}^\dagger$, since otherwise we are done, we will use $\mathcal{K}$ and $\Omega(\mathcal{I})$ to construct a (new) maximal universal set that contains $\mathcal{I}^\dagger$.

Set up a $p$-ary tree, as above, with root $\Omega_{00} = \Omega(\mathcal{I})$ and $(\ell, a)$-node the congruence class

\[ \Omega_{\ell a} = \{i \in \Omega(\mathcal{I}) : i \equiv a \bmod p^\ell\}, \quad |\Omega_{\ell a}| = \chi_\ell(a), \]

for $a \in [0 : p^\ell - 1]$. Recall that $\Omega_{\ell a}$, at level $\ell$, is the disjoint union of the sets at its children nodes at level $\ell + 1$.

Figure 3 is an example for $p = 3$ and $M \ge 3$, showing only three levels for reasons of space. The shading has to do with the rest of the proof, as we now explain.

Both $\mathcal{I}^\dagger$ and $\mathcal{K}$ are assembled by choosing single elements from sets at the nodes in the $k$-level (call these the assembly nodes) for a total of $p^k$ elements for $\mathcal{I}^\dagger$ and $\mathcal{K}$ each. Observe that the sets at the nodes in the $k + 1$ level are either empty or singletons. This is so because by definition of $k$ there must be some $a \in [0 : p^{k+1} - 1]$ for which $\chi_{k+1}(a) = 0$, and hence by universality $\chi_{k+1}(b) \le 1$ for all $b \in [0 : p^{k+1} - 1]$. And then, according to how the tree is structured, the sets at all nodes farther down in the tree must as well be either empty or singletons.

Let $\mathcal{L} \supseteq \mathcal{I}^\dagger$ be the set of elements in $\mathcal{I}$ that leave the same remainders as do the elements in $\mathcal{I}^\dagger$ when divided by $p^{k+1}$; more precisely,

\[ \mathcal{L} = \{j \in \mathcal{I} : j \equiv i \bmod p^{k+1} \text{ for some } i \in \mathcal{I}^\dagger\}. \]

Likewise let $\mathcal{L}' \supseteq \mathcal{K}$ be

\[ \mathcal{L}' = \{j \in \mathcal{I} : j \equiv i \bmod p^{k+1} \text{ for some } i \in \mathcal{K}\}. \]

$\mathcal{L}$ is the union of the assembly nodes for $\mathcal{I}^\dagger$ and $\mathcal{L}'$ is the union of the assembly nodes for $\mathcal{K}$. The collections may overlap.

We color a node red if it contributes to $\mathcal{L}$ and blue if it contributes to $\mathcal{L}'$, and both red and blue (otherwise known as purple) if it contributes to both $\mathcal{L}$ and $\mathcal{L}'$. In the figure we take $k = 1$, so $\mathcal{I}^\dagger$ and $\mathcal{K}$ live at the middle level in the tree, as shown.

Focus on each red node in turn. The red node contains an element in $\mathcal{I}^\dagger$, say $i$.

1) If $\Omega(\mathcal{I})$ contains an element from this red node, say $j$ (which may or may not be equal to $i$), we replace $j \in \Omega(\mathcal{I})$ with $i$. This changes neither the size of $\Omega(\mathcal{I})$ nor the universality.

2) Now suppose $\Omega(\mathcal{I})$ does not contain an element from this red node. We know that the sibling blue node (i.e. the blue node that shares the parent with this red node) contains an element of $\mathcal{K}$ (and hence of $\Omega(\mathcal{I})$), say $j$. Replace $j \in \Omega(\mathcal{I})$ with $i$. This changes neither the size nor the universality; we are just exchanging one element from a node with its sibling, so the value of $\chi$ at the parent node does not change.

These operations preserve size and universality, and repeating them for each red node ensures that the resultant set contains $\mathcal{I}^\dagger$.



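A stronger version of the upper bound in (i) is the following.

Corollary 1: Let $\mathcal{I} \subseteq [0 : p^M - 1]$ and let $k$ be the smallest integer such that $\chi_k(a ; \mathcal{I}) = 0$ for some $a$. Then,

\[ |\Omega(\mathcal{I})| \le |\{a : \chi_k(a ; \mathcal{I}) \ne 0\}|. \]

Proof: From the definition of $k$, we have $\chi_k(a_0 ; \mathcal{I}) = 0$ for some $a_0$. Hence by universality, $\Omega(\mathcal{I})$ must satisfy $|\chi_k(b ; \Omega(\mathcal{I}))| \le 1$ for all $b$, an observation we used above and will use again. From the cardinality equation,

\[ |\Omega(\mathcal{I})| = \sum_b \chi_k(b ; \Omega(\mathcal{I})) \le |\{a : \chi_k(a ; \mathcal{I}) \ne 0\}|. \]

■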

Ultimately we will show that when $N = p^M$ any maximal universal sampling set, and in particular any universal sampling set, is a disjoint union of elementary sets. In general, however, the union of two disjoint, elementary sets need not be universal. For example, take $N = 2^3$, $\mathcal{E} = \{0, 1\}$, $\mathcal{E}' = \{4, 5\}$. Then $\mathcal{E}$ and $\mathcal{E}'$ are elementary but their union $\mathcal{E} \cup \mathcal{E}' = \{0, 1, 4, 5\}$ is not universal. What is needed is a kind of independence condition on a collection of elementary sets. The following lemma, whose converse we will also show, makes this latter point precise and introduces the main features of the structure of universal sets.

Lemma 7: Let $N = p^M$. Suppose there exists a finite sequence of nonincreasing integers $k_1 \ge k_2 \ge \cdots \ge 0$ and sets $\mathcal{E}_r \subseteq [0 : N - 1]$, $r = 1, 2, \ldots$, such that

(i) $\mathcal{E}_r$ is $k_r$-elementary.

(ii) For each $r > 1$

\[ \mathcal{E}_r \cap \Big( \bigcup_{j=1}^{r-1} \mathcal{L}_j \Big) = \emptyset, \]

where

\[ \mathcal{L}_j = \{x \in [0 : N - 1] : x \equiv e \bmod p^{k_j + 1} \text{ for some } e \in \mathcal{E}_j\}. \]

Let

\[ \mathcal{I} = \bigcup_r \mathcal{E}_r. \]

Then $\mathcal{I}$ is a universal sampling set.
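
The hypotheses of Lemma 7 can be checked mechanically for a proposed decomposition. The sketch below is our own illustration; the function name and the representation of the decomposition as a list of pairs $(k_r, \mathcal{E}_r)$ are assumptions. It confirms that the pair $\{0, 1\}$, $\{4, 5\}$ from the example fails condition (ii), while $\{0, 1\}$, $\{2, 3\}$ passes.

```python
def satisfies_lemma7(p, M, pieces):
    """pieces: list of (k_r, E_r) in order. Checks hypotheses (i) and (ii) of Lemma 7."""
    N = p ** M
    ks = [k for k, _ in pieces]
    if any(a < b for a, b in zip(ks, ks[1:])):
        return False                       # the k_r must be nonincreasing
    blocked = set()                        # union of the sets L_j seen so far
    for k, E in pieces:
        if sorted(e % p ** k for e in E) != list(range(p ** k)):
            return False                   # (i) fails: E_r is not k_r-elementary
        if blocked & set(E):
            return False                   # (ii) fails: E_r meets an earlier L_j
        residues = {e % p ** (k + 1) for e in E}
        blocked |= {x for x in range(N) if x % p ** (k + 1) in residues}
    return True

print(satisfies_lemma7(2, 3, [(1, {0, 1}), (1, {4, 5})]))
print(satisfies_lemma7(2, 3, [(1, {0, 1}), (1, {2, 3})]))
```

In the failing case $\mathcal{L}_1 = \{0, 1, 4, 5\}$ already contains $\mathcal{E}' = \{4, 5\}$, which is exactly the overlap that condition (ii) rules out.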

Obviously it is condition (ii) that requires further comment. The set $\mathcal{L}_r$ is defined much as in the proof of Theorem 5 and we will illustrate the point of (ii) again by means of a tree. Observe first that the $\mathcal{E}_r$ are disjoint. This follows from (ii), since $\mathcal{L}_j \supseteq \mathcal{E}_j$.

We build a congruence tree with root the full interval $[0 : N - 1]$. Write this as $\mathcal{N}_{00}$ and write $\mathcal{N}_{ka}$ for the congruence class of $a$ modulo $p^k$ in $[0 : N - 1]$, so that $|\mathcal{N}_{ka}| = \chi_k(a ; [0 : N - 1])$. All the nodes represent non-singletons, except the bottom-most level, $M$. As before, Figure 4 has $p = 3$, $M \ge 3$ and shows the tree only up to the third level.

Suppose $k_1 = 1$, so $\mathcal{E}_1$, as an elementary set, contains one element from each node at the middle level in the figure. In turn, suppose $\mathcal{E}_1$ comes from picking one element from each of the red nodes. The set $\mathcal{L}_1$ is the union of the red nodes. Now, the set $\mathcal{E}_2$ comes from choosing one element from each node at the $k_2$-level, and the sequence $k_r$ is nonincreasing so $\mathcal{E}_2$ is drawn from nodes in a level at or higher up in the tree than $\mathcal{E}_1$ (in this example $k_2$ is either 1 or 0). Condition (ii) requires that $\mathcal{E}_2$ be disjoint from the red nodes, not just from $\mathcal{E}_1$ which is a (small) subset of the red nodes.

In the general case, think of $k_1$ as large (eventually it will be chosen as in Theorem 5), so $\mathcal{E}_1$ comes from a level far down the tree from the root, and then $\mathcal{E}_2, \mathcal{E}_3, \ldots$ are, at least, no further down since $k_1 \ge k_2 \ge \cdots$. Condition (ii) requires that $\mathcal{E}_r$ be assembled from nodes that were not used in assembling any of the $\mathcal{E}_s$ for $s < r$. It is this property that we exploit to show that $\bigcup \mathcal{E}_r$ is universal.

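Proof of Lemma 7: Fix $r$ and $s$ with $k_r < s$, and note that

\[ \chi_s(a ; \mathcal{E}_r) \in \{0, 1\}, \quad a \in [0 : p^s - 1], \tag{26} \]

from (25). Now suppose $\chi_s(a ; \mathcal{E}_r) = 1$, so one element in $\mathcal{E}_r$ leaves a remainder of $a$ on dividing by $p^s$. Then

\[ \chi_s(a ; \mathcal{E}_t) = 0 \quad \text{for all } t > r, \tag{27} \]

i.e., none of the $\mathcal{E}_t$ for $t > r$ will have an element from the congruence class of $a$ modulo $p^s$. This follows (just as described for the tree) from $\mathcal{E}_t \cap \mathcal{L}_r = \emptyset$, and also

\[
\begin{aligned}
\mathcal{L}_r &= \{x \in [0 : N - 1] : x \equiv e \bmod p^{k_r + 1} \text{ for some } e \in \mathcal{E}_r\}\\
&\supseteq \{x \in [0 : N - 1] : x \equiv e \bmod p^s \text{ for some } e \in \mathcal{E}_r\}.
\end{aligned}
\]

From (26) and (27) we conclude that

\[ \sum_r \chi_s(a ; \mathcal{E}_r) \in \{0, 1\} \tag{28} \]

for all $a$, where the sum is over all $r$ with $k_r < s$.

With this we can show that $\mathcal{I} = \bigcup_r \mathcal{E}_r$ is universal. For any $s$, and for any $a, b \in [0 : p^s - 1]$,

\[
\begin{aligned}
\chi_s(a ; \mathcal{I}) - \chi_s(b ; \mathcal{I})
&= \sum_r \chi_s(a ; \mathcal{E}_r) - \sum_r \chi_s(b ; \mathcal{E}_r)\\
&= \Big( \sum_{r \,:\, k_r \ge s} \chi_s(a ; \mathcal{E}_r) - \sum_{r \,:\, k_r \ge s} \chi_s(b ; \mathcal{E}_r) \Big)
 + \Big( \sum_{r \,:\, k_r < s} \chi_s(a ; \mathcal{E}_r) - \sum_{r \,:\, k_r < s} \chi_s(b ; \mathcal{E}_r) \Big)\\
&= \sum_{r \,:\, k_r < s} \chi_s(a ; \mathcal{E}_r) - \sum_{r \,:\, k_r < s} \chi_s(b ; \mathcal{E}_r)
\end{aligned} \tag{29}
\]

(the first two sums cancel, by (24)).

From (28) we have that $|\chi_s(a ; \mathcal{I}) - \chi_s(b ; \mathcal{I})| \le 1$, so $\mathcal{I}$ is universal. ■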



C. An Algorithm to Construct Maximal Universal Sets

Consider now the problem of finding a maximal universal sampling set contained in a given $\mathcal{I} \subseteq [0 : p^M - 1]$. Build the congruence class tree with root $\mathcal{I}$, as in Figure 2, up to level $M$. The leaves having weight 1 are singletons in $\mathcal{I}$, and $\chi_k(a ; \mathcal{I})$, $a \in [0 : p^k - 1]$, is the total weight at node $(k, a)$. The problem of constructing $\Omega(\mathcal{I})$ is to pick a subset of the leaves so that the tree with root $\Omega(\mathcal{I})$ is well balanced at each level. By 'well balanced' we mean that at any given level, all the subtrees have roughly equal weight, corresponding to the condition $|\chi_k(a ; \Omega(\mathcal{I})) - \chi_k(b ; \Omega(\mathcal{I}))| \le 1$. The following algorithm realizes this and provides the value of $|\Omega(\mathcal{I})|$. It marries the construction of elementary sets in Theorem 5 with an iterative version of the method used in the proof of Lemma 7.
Let $\mathcal{I} \subseteq [0 : p^M - 1]$. Initialize with $\mathcal{I}_1 = \mathcal{I}$, and $r = 1$.

1) Let $k_r$ be the largest integer such that no congruence class in $\mathcal{I}_r/p^{k_r}$ is empty.

2) Construct an elementary set $\mathcal{I}_r^\dagger \subseteq \mathcal{I}_r$ by choosing one element of $\mathcal{I}_r$ from each congruence class modulo $p^{k_r}$. (There may not be a unique choice, and this is the reason why there may be many maximal universal sets contained in $\mathcal{I}$.)

3) Define $\mathcal{L}_r \supseteq \mathcal{I}_r^\dagger$ by

\[ \mathcal{L}_r = \{j \in \mathcal{I}_r : j \equiv i \bmod p^{k_r + 1} \text{ for some } i \in \mathcal{I}_r^\dagger\}. \]

4) Let $\mathcal{I}_{r+1} = \mathcal{I}_r \setminus \mathcal{L}_r$. Stop if $\mathcal{I}_{r+1} = \emptyset$. Else increment $r$ to $r + 1$ and go to (1).

Note the following:

(i) At each step of the algorithm the size of $\mathcal{I}_r$ is reduced by $|\mathcal{L}_r| \ge |\mathcal{I}_r^\dagger| = p^{k_r} \ge 1$:

\[ |\mathcal{I}_{r+1}| \le |\mathcal{I}_r| - p^{k_r}. \]

Since $\mathcal{I} = \mathcal{I}_1$ is a finite set, the algorithm terminates at some point.

(ii) The $k_r$ are nonincreasing:

\[ k_1 \ge k_2 \ge k_3 \ge \cdots. \]

We can now state

Theorem 6: With $k_r$, $r \ge 1$, defined as above, we have

\[ |\Omega(\mathcal{I})| = \sum_r p^{k_r}. \tag{30} \]

One possible maximal universal sampling set is

\[ \Omega(\mathcal{I}) = \bigcup_r \mathcal{I}_r^\dagger. \tag{31} \]

By construction this is a disjoint union.

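Here is an example of the algorithm in action. Let $N = 2^5$ and

\[ \mathcal{I} = \{0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 14, 15\} = \mathcal{I}_1. \]

1) ($r = 1$) Note that $\chi_3(5 ; \mathcal{I}_1) = 0$, and that no values $\chi_2(i ; \mathcal{I}_1)$ are zero. Hence $k_1 = 2$. Form $\mathcal{I}_1^\dagger$ by taking one element from each congruence class in $\mathcal{I}_1$ modulo $2^{k_1} = 4$, e.g. $\mathcal{I}_1^\dagger = \{0, 1, 2, 3\}$. Then $\mathcal{L}_1 = \{0, 1, 2, 3, 8, 9, 10\}$ is the set of all elements of $\mathcal{I}_1$ that leave a remainder of 0, 1, 2 or 3 on dividing by $2^{k_1 + 1} = 8$. Removing such numbers from $\mathcal{I}_1$, we have $\mathcal{I}_2 = \mathcal{I}_1 \setminus \mathcal{L}_1 = \{4, 6, 7, 12, 14, 15\}$.

2) ($r = 2$) Now $\chi_2(1 ; \mathcal{I}_2) = 0$ while $\chi_1(0 ; \mathcal{I}_2), \chi_1(1 ; \mathcal{I}_2) \ne 0$, so $k_2 = 1$. Let $\mathcal{I}_2^\dagger = \{4, 7\}$. Then $\mathcal{L}_2 = \{4, 7, 12, 15\}$ is the set of all elements in $\mathcal{I}_2$ that leave a remainder of $4 \bmod 4 = 0$ or $7 \bmod 4 = 3$ on dividing by $2^{k_2 + 1} = 4$. Removing such numbers from $\mathcal{I}_2$, we have $\mathcal{I}_3 = \mathcal{I}_2 \setminus \mathcal{L}_2 = \{6, 14\}$.

3) ($r = 3$) Now clearly $k_3 = 0$. Let $\mathcal{I}_3^\dagger = \{6\}$. Then $\mathcal{L}_3 = \{6, 14\}$, $\mathcal{I}_4 = \emptyset$ and the algorithm terminates.

According to the theorem, we have $|\Omega(\mathcal{I})| = 2^{k_1} + 2^{k_2} + 2^{k_3} = 7$, and an example $\Omega(\mathcal{I})$ is given by $\mathcal{I}_1^\dagger \cup \mathcal{I}_2^\dagger \cup \mathcal{I}_3^\dagger = \{0, 1, 2, 3, 4, 6, 7\}$.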
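
Steps 1 to 4 translate almost line by line into code. The following Python sketch is our own rendering of the algorithm and not an implementation from the original work. The function name is hypothetical, and where Step 2 allows a free choice we always take the smallest element of each congruence class, which is an arbitrary tie-breaking rule.

```python
def max_universal_subset(index_set, p, M):
    """Run Steps 1-4 on a subset of [0 : p^M - 1]; return (exponents k_r, sets I_r^dagger)."""
    remaining = set(index_set)
    exponents, chosen = [], []
    while remaining:
        # Step 1: largest k such that every class modulo p^k is nonempty.
        k = 0
        while k < M and len({i % p ** (k + 1) for i in remaining}) == p ** (k + 1):
            k += 1
        # Step 2: one element (the smallest) from each class modulo p^k.
        reps = {}
        for i in sorted(remaining):
            reps.setdefault(i % p ** k, i)
        elementary = set(reps.values())
        # Step 3: everything sharing a residue modulo p^(k+1) with a chosen element.
        q = p ** (k + 1)
        used = {i % q for i in elementary}
        removed = {j for j in remaining if j % q in used}
        # Step 4: discard those elements and repeat.
        remaining -= removed
        exponents.append(k)
        chosen.append(sorted(elementary))
    return exponents, chosen

I = [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 14, 15]
ks, parts = max_universal_subset(I, 2, 5)
print(ks, parts, sum(2 ** k for k in ks))
```

With the smallest-element rule the run follows the worked example above step for step; other choices in Step 2 would give other maximal sets of the same size.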

We have several additional comments. First, we can say more about the formula for $|\Omega(\mathcal{I})|$. Since the $k_r$'s are nonincreasing, a typical sequence is, say,

\[ \underbrace{l_1, \ldots, l_1}_{\alpha_1 \text{ times}},\; \underbrace{l_2, \ldots, l_2}_{\alpha_2 \text{ times}},\; \underbrace{l_3, l_3, \ldots}_{\alpha_3 \text{ times}} \]

with $l_1 > l_2 > l_3$. Given this, equation (30) appears as

\[ |\Omega(\mathcal{I})| = \alpha_1 p^{l_1} + \alpha_2 p^{l_2} + \alpha_3 p^{l_3} + \cdots. \tag{32} \]

In fact, effectively, Theorem 6 constructs a base $p$ expansion of $|\Omega(\mathcal{I})|$ because each power of $p$ appears at most $p - 1$ times.

Fig. 4: Similar to Figure 3 but with root $\mathcal{N}_{00} = [0 : N - 1]$, this tree shows the relationship between $\mathcal{N}_{ka}$, for $p = 3$, $M \ge 3$.
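Corollary 2: Let $\mathcal{I} \subseteq [0 : p^M - 1]$. The formula (30) is of the form,

\[ |\Omega(\mathcal{I})| = \sum_r p^{k_r} = \sum_s \alpha_s p^{l_s}, \]

with $l_1 > l_2 > l_3 > \cdots$ and $\alpha_s \in [0 : p - 1]$ for all $s$.

Proof: Begin with $\Omega(\mathcal{I}) = \bigcup_r \mathcal{I}_r^\dagger$. Since the $\mathcal{I}_r^\dagger$ are disjoint, we have

\[ \sum_{r=1}^{\alpha_1} \chi_{l_1 + 1}(a ; \mathcal{I}_r^\dagger) = \chi_{l_1 + 1}\Big(a ; \bigcup_{r=1}^{\alpha_1} \mathcal{I}_r^\dagger\Big) \le \chi_{l_1 + 1}(a ; \Omega(\mathcal{I})), \quad a \in [0 : p^{l_1 + 1} - 1]. \]

Summing this over all $a \in [0 : p^{l_1 + 1} - 1]$ we have

\[ \alpha_1 p^{l_1} = \sum_{r=1}^{\alpha_1} \sum_a \chi_{l_1 + 1}(a ; \mathcal{I}_r^\dagger) \le \sum_a \chi_{l_1 + 1}(a ; \Omega(\mathcal{I})) = |\Omega(\mathcal{I})| < p^{l_1 + 1}, \tag{33} \]

so $\alpha_1 < p$. For the last inequality in (33) we have used the upper bound from part (i) in Theorem 5. We have also used that $|\mathcal{I}_r^\dagger| = p^{k_r}$. The proof for other $\alpha_s$ is similar. For example, to prove that $\alpha_2 < p$ we start with

\[ \sum_{r=\alpha_1 + 1}^{\alpha_1 + \alpha_2} \chi_{l_2 + 1}(a ; \mathcal{I}_r^\dagger) \le \chi_{l_2 + 1}(a ; \Omega(\mathcal{I}_{\alpha_1 + 1})) \]

instead of (33). ■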
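If the algorithm above were initialized with a universal set $\mathcal{I}$, then from Theorem 6 we would obtain $\mathcal{I} = \Omega(\mathcal{I}) = \bigcup_r \mathcal{I}_r^*$. This allows us to conclude that any universal set $\mathcal{I}$ is a union of elementary universal sets. Moreover, the sets $\mathcal{I}_r^*$ defined by the algorithm satisfy the conditions in Lemma 7. For condition (ii), note that in the algorithm the set $\mathcal{I}_r$ is recursively defined as $\mathcal{I}_r = \mathcal{I}_{r-1} \setminus \mathcal{C}_{r-1}$, so that

$$\mathcal{I}_r = \mathcal{I}_1 \setminus \Bigl(\bigcup_{j=1}^{r-1} \mathcal{C}_j\Bigr).$$

Hence $\mathcal{I}_r \cap \bigl(\bigcup_{j=1}^{r-1} \mathcal{C}_j\bigr) = \emptyset$, and the $\mathcal{I}_r^*$ obtained by the algorithm satisfy $\mathcal{I}_r^* \cap \bigl(\bigcup_{j=1}^{r-1} \mathcal{C}_j\bigr) = \emptyset$, since $\mathcal{I}_r^* \subseteq \mathcal{I}_r$. Putting all these comments together we have the converse of Lemma 7, and then adding Theorem 6 we can state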

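Corollary 3: $\mathcal{I} \subseteq [0 : p^M - 1]$ is universal if and only if there exist

(i) A nonincreasing finite sequence $k_1 \ge k_2 \ge \cdots \ge 0$, with each value of $k_r$ repeating at most $p - 1$ times;

(ii) Sets $\mathcal{I}_r^* \subseteq \mathcal{I}$ with $\mathcal{I} = \bigcup_r \mathcal{I}_r^*$,

such that

(iii) $\mathcal{I}_r^*$ is a $k_r$-elementary universal set;

(iv) $\mathcal{I}_r^* \cap \bigl(\bigcup_{j=1}^{r-1} \mathcal{C}_j\bigr) = \emptyset$, where

$$\mathcal{C}_j = \{x \in [0 : N-1] : x \equiv i \bmod p^{k_j+1} \text{ for some } i \in \mathcal{I}_j^*\}.$$

Note that from (i), (ii) and (iii) we can also conclude that

$$|\mathcal{I}| = \sum_r |\mathcal{I}_r^*| = \sum_r p^{k_r},$$

so the $k_r$ are the powers of $p$ appearing in the base-$p$ expansion of $|\mathcal{I}|$, taken with repetitions. For example, with $N = 9$ and $|\mathcal{I}| = 7 = 2 \cdot 3^1 + 1 \cdot 3^0$, we expect the universal set $\mathcal{I} = \mathcal{I}_1^* \cup \mathcal{I}_2^* \cup \mathcal{I}_3^*$ with $\mathcal{I}_1^*$ and $\mathcal{I}_2^*$ being 1-elementary, and $\mathcal{I}_3^*$ being 0-elementary. Corollary 3 implies that the $k_r$ read off from the base-$p$ expansion of $|\mathcal{I}|$ must be the same as the $k_r$ generated by the algorithm if $\mathcal{I}$ is universal.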
Remark 4 (Universal sets of prescribed order): As it stands, the algorithm finds a universal set of the largest size contained in $\mathcal{I}$. With Corollary 3 we can now modify the algorithm to solve the following problem:

Given a set $\mathcal{I} \subseteq [0 : p^M - 1]$, and $d \le |\Omega(\mathcal{I})|$, find a universal set $\mathcal{J} \subseteq \mathcal{I}$ with $|\mathcal{J}| = d$.

We follow the algorithm as in Steps 1-4, but we change the definition of $k_r$ in Step 1. Write the base-$p$ expansion of $d$ with repetitions, $d = \sum_r p^{k_r}$, read off the $k_r$ as the powers of $p$ that appear in the expansion, and arrange the $k_r$ in nonincreasing order. This ensures that condition (i) in Corollary 3 is satisfied. The construction of the $\mathcal{I}_r^*$ in Steps 2-4 of the algorithm will ensure that (iii) and (iv) are satisfied. We conclude that with the $\mathcal{I}_r^*$ so obtained by the algorithm the set $\mathcal{J} = \bigcup_r \mathcal{I}_r^* \subseteq \mathcal{I}$ is universal, and it is of the right size by definition of the $k_r$.
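The change to Step 1 described in Remark 4 amounts to reading the exponents $k_r$ off the base-$p$ digits of $d$. A minimal Python sketch follows; the function name `exponents_with_repetition` is ours and is not part of the paper's algorithm.

```python
def exponents_with_repetition(d, p):
    """Return the nonincreasing list k_1 >= k_2 >= ... with d = sum(p**k_r).

    Each exponent k appears as many times as the base-p digit of d at
    position k, hence at most p - 1 times.
    """
    digits = []
    while d > 0:
        d, digit = divmod(d, p)
        digits.append(digit)          # digits[k] multiplies p**k
    ks = []
    for k in range(len(digits) - 1, -1, -1):
        ks.extend([k] * digits[k])    # repeat exponent k digit-many times
    return ks


# The example from the text: N = 9, d = 7 = 2*3^1 + 1*3^0.
print(exponents_with_repetition(7, 3))
```

For $d = 7$, $p = 3$ this yields two exponents equal to 1 and one equal to 0, matching the two 1-elementary sets and one 0-elementary set in the example above.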
Finally, we have 

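Proof of Theorem 6: As observed above, the $\mathcal{I}_r^*$ generated by the algorithm satisfy the hypotheses of Lemma 7, so the set $\bigcup_r \mathcal{I}_r^*$ is universal. If we show

$$|\Omega(\mathcal{I})| \le \sum_r p^{k_r},$$

then Theorem 6 follows. For this we prove

$$|\Omega(\mathcal{I}_r)| \le p^{k_r} + |\Omega(\mathcal{I}_{r+1})|. \tag{34}$$

We appeal to Theorem 5 to find a maximal universal sampling set $\mathcal{A}$ with $\mathcal{I}_r^* \subseteq \mathcal{A} \subseteq \mathcal{I}_r$, and we will show

$$\mathcal{A} \setminus \mathcal{I}_r^* \text{ is universal}, \tag{35}$$

$$\mathcal{A} \setminus \mathcal{I}_r^* \subseteq \mathcal{I}_{r+1}. \tag{36}$$

Since $|\mathcal{A}| = |\Omega(\mathcal{I}_r)|$, these imply

$$|\Omega(\mathcal{I}_r)| - p^{k_r} = |\mathcal{A} \setminus \mathcal{I}_r^*| \le |\Omega(\mathcal{I}_{r+1})|,$$

which is (34).

First (35). Now,

$$\chi_s(a; \mathcal{A} \setminus \mathcal{I}_r^*) = \chi_s(a; \mathcal{A}) - \chi_s(a; \mathcal{I}_r^*), \qquad a \in [0 : p^s - 1],$$

and for $s \le k_r$ the second term is constant,

$$\chi_s(a; \mathcal{I}_r^*) = p^{k_r - s},$$

from (24). Since $\mathcal{A}$ is universal, $|\chi_s(a; \mathcal{A}) - \chi_s(b; \mathcal{A})| \le 1$ for all $a, b \in [0 : p^s - 1]$ and for all $s$, so we at least have

$$|\chi_s(a; \mathcal{A} \setminus \mathcal{I}_r^*) - \chi_s(b; \mathcal{A} \setminus \mathcal{I}_r^*)| \le 1,$$

for all $s \le k_r$. We need to check that this inequality continues to hold for $s \ge k_r + 1$.

As we have argued before, by the definition of $k_r$ at least one congruence class in $\mathcal{I}_r / p^s$ is empty when $s \ge k_r + 1$, so $\chi_s(a_0; \mathcal{I}_r) = 0$ for some $a_0$, and because $\mathcal{A} \subseteq \mathcal{I}_r$ is universal we have $\chi_s(a; \mathcal{A}) \le 1$ for all $a$. Furthermore, $\mathcal{I}_r^* \subseteq \mathcal{A}$ implies

$$0 \le \chi_s(a; \mathcal{A}) - \chi_s(a; \mathcal{I}_r^*) = \chi_s(a; \mathcal{A} \setminus \mathcal{I}_r^*).$$

Hence the values of $\chi_s(a; \mathcal{A} \setminus \mathcal{I}_r^*)$ are in $\{0, 1\}$ and consequently

$$|\chi_s(a; \mathcal{A} \setminus \mathcal{I}_r^*) - \chi_s(b; \mathcal{A} \setminus \mathcal{I}_r^*)| \le 1,$$

for all $s \ge k_r + 1$. This establishes that $\mathcal{A} \setminus \mathcal{I}_r^*$ is universal.

We prove (36) by contradiction. If it were not true that $\mathcal{A} \setminus \mathcal{I}_r^* \subseteq \mathcal{I}_{r+1}$ then there would exist an $x \in (\mathcal{A} \setminus \mathcal{I}_r^*) \cap \mathcal{C}_r$. Then $\chi_{k_r+1}([x]_{k_r+1}; \mathcal{A}) \ge 2$, for on dividing by $p^{k_r+1}$, $x$ leaves a remainder of $[x]_{k_r+1}$ (by definition) and so does one other element in $\mathcal{I}_r^*$. But this contradicts $\chi_s(a; \mathcal{A}) \le 1$ for $s \ge k_r + 1$ from the preceding paragraph.

This completes the proof of Theorem 6. ■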
Remark 5: We can give an upper bound for the computational complexity of the algorithm for constructing a universal sampling set of size $d$ (including constructing a maximal universal sampling set). Within an iteration, in the worst case the algorithm makes a complete pass over all the nodes of the tree once, and the number of nodes is $O(N)$. Further, the number of iterations is $\alpha_1 + \alpha_2 + \cdots + \alpha_M$ where

$$d = \alpha_1 p^{M-1} + \alpha_2 p^{M-2} + \cdots + \alpha_{M-1} p + \alpha_M.$$

Hence the largest number of iterations is $(p - 1)M$, and the complexity of the algorithm is at most $O(N \log N)$.

D. Counting Universal Sets 

The preceding structure theorems allow us to find the number of universal sampling sets $\mathcal{I} \subseteq [0 : p^M - 1]$ of size $d$. The formula uses the digits from the base-$p$ expansion of $d$, and as above we let

$$d = \alpha_1 p^{M-1} + \alpha_2 p^{M-2} + \cdots + \alpha_{M-1} p + \alpha_M,$$

where $0 \le \alpha_i < p$. For $i = 0, 1, \ldots, M$ define

$$d_i = \sum_{j=i+1}^{M} \alpha_j p^{M-j}.$$

Hence $d_0 = d$ and $d_M = 0$.

Theorem 7: The number of universal sampling sets in $[0 : p^M - 1]$ of size $d$ is

$$C(d, p^M) = \prod_{i=1}^{M} \binom{p}{\alpha_i + 1}^{d_i} \binom{p}{\alpha_i}^{p^{M-i} - d_i}.$$

Proof: The proof goes by establishing a recurrence relation for $C$ in the $d_i$. Let $\mathcal{I}$ be a universal sampling set of size $d$ and construct the congruence tree as in Figure 2 with root $\mathcal{I}_{00} = \mathcal{I}$. We first note that $d_1$ of the nodes at level $M - 1$ have weight $\alpha_1 + 1$ and the remaining $p^{M-1} - d_1$ nodes have weight $\alpha_1$, where

$$d_1 = \sum_{i=2}^{M} \alpha_i p^{M-i}.$$

The proof for this is along the same lines as the argument in the proof of Theorem 4, (i) $\Rightarrow$ (ii). Figure 5 illustrates this. The singleton blue nodes at the bottom level are the elements of $\mathcal{I}$, and the other nodes (which would be the singletons $\{6\}$ and $\{7\}$) are empty. The red nodes at the penultimate level represent the nodes that have weight $\alpha_1 + 1$ (and there are $d_1$ of them).

2 We are grateful to a reviewer for suggesting a way to make greater use 
of the recursive aspect of our original argument, resulting in a much shorter 
and cleaner proof. 
Fig. 5: The congruence class tree for $N = 8$. The universal sampling set $\{0, 1, 2, 3, 4, 5\}$ of size $d = 6$ is represented by the blue nodes at the bottom level. The red nodes at the penultimate level represent the nodes that have weight 2; the rest of the nodes at the penultimate level have weight 1.



Now remove the bottom level of the tree, effectively making $N = p^{M-1}$, and resulting in Figure 6. If the starting set (the blue nodes in Figure 5) is universal, then so must be the set formed by the red nodes in Figure 6. Hence the number of ways of choosing the red nodes is the same as the number of universal sampling sets of size $d_1$ in $[0 : p^{M-1} - 1]$, that is

$$C(d_1, p^{M-1}).$$

Once the red nodes are chosen, we need to choose the blue nodes by taking $\alpha_1 + 1$ of the $p$ children of each red node and $\alpha_1$ of the $p$ children of each remaining (non-red) node, which can be done in

$$\binom{p}{\alpha_1 + 1}^{d_1} \binom{p}{\alpha_1}^{p^{M-1} - d_1}$$

ways. Hence

$$C(d, p^M) = \binom{p}{\alpha_1 + 1}^{d_1} \binom{p}{\alpha_1}^{p^{M-1} - d_1} C(d_1, p^{M-1}). \tag{37}$$

The full formula follows by iterating this recurrence. ■
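The recurrence (37) is easy to evaluate and to check against brute-force enumeration for small $N$. The sketch below is illustrative only: the function names are ours, and `is_universal` assumes the residue-count criterion used throughout the proofs above (for every $s$, the numbers of elements in the congruence classes mod $p^s$ differ by at most 1) as the test for universality when $N = p^M$.

```python
from itertools import combinations
from math import comb


def is_universal(index_set, p, M):
    """Residue-count criterion (assumed characterization for N = p**M)."""
    for s in range(1, M + 1):
        counts = [0] * p**s
        for x in index_set:
            counts[x % p**s] += 1
        if max(counts) - min(counts) > 1:
            return False
    return True


def count_universal(d, p, M):
    """C(d, p^M) computed from the recurrence (37)."""
    if M == 0:                       # N = 1: only the sizes 0 and 1 occur
        return 1 if d in (0, 1) else 0
    alpha1, d1 = divmod(d, p**(M - 1))   # leading digit and remainder d_1
    return (comb(p, alpha1 + 1)**d1
            * comb(p, alpha1)**(p**(M - 1) - d1)
            * count_universal(d1, p, M - 1))


def count_brute_force(d, p, M):
    """Enumerate all d-subsets of [0 : p^M - 1] and count the universal ones."""
    return sum(is_universal(S, p, M) for S in combinations(range(p**M), d))


for d in range(9):
    print(d, count_universal(d, 2, 3), count_brute_force(d, 2, 3))
```

The two columns printed for $N = 8$ agree, and the same comparison can be run for $N = 9$; larger $N$ quickly make the brute-force side impractical, which is the point of having the closed formula.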
One special case of the counting formula is easy to evaluate.

Corollary 4: Let $d = p^k$ where $k < M$. Then the number of universal sets of size $d$ in $[0 : p^M - 1]$ is $(p^M / d)^d$.

In particular when $N = 2^M$ and $d = 2^{M-1} = N/2$, the number of universal sets is $2^{N/2}$. On the other hand, the total number of sets of size $2^{M-1}$ in $[0 : 2^M - 1]$ is

$$\binom{N}{N/2} \approx \frac{2^N}{\sqrt{\pi N / 2}}$$

by Stirling's approximation. Hence the fraction of sets that are universal is approximately $\sqrt{\pi N / 2}\, / \, 2^{N/2}$, which decreases exponentially with $N$.
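As a quick numerical illustration of how fast this fraction decays, the following Python sketch compares the exact ratio $2^{N/2} / \binom{N}{N/2}$ with the Stirling-based estimate $\sqrt{\pi N/2}\,/\,2^{N/2}$. The function names are ours, introduced only for this illustration.

```python
from math import comb, pi, sqrt


def exact_fraction(M):
    """Exact fraction of (N/2)-subsets of [0 : N-1], N = 2**M, that are universal."""
    N = 2**M
    return 2**(N // 2) / comb(N, N // 2)


def stirling_fraction(M):
    """Approximation sqrt(pi*N/2) / 2**(N/2) obtained from Stirling's formula."""
    N = 2**M
    return sqrt(pi * N / 2) / 2**(N // 2)


for M in range(2, 8):
    print(2**M, exact_fraction(M), stirling_fraction(M))
```

Already for moderate $N$ the two columns are close, and both shrink roughly like $2^{-N/2}$.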

The function $C(d, p^M)$ is certainly complicated, but it has some remarkable properties. Though not clear from the formula, we have

$$C(d, p^M) = C(p^M - d, p^M).$$

This follows from the following lemma, which is itself a simple but interesting property of universal sampling sets.

Fig. 6: Remove the bottom level of the tree in Figure 5. The resulting red nodes are a universal sampling set in $[0 : 3]$.

Lemma 8: If $\mathcal{A} \subseteq [0 : p^M - 1]$ is a universal sampling set then so is $\mathcal{A}' = [0 : p^M - 1] \setminus \mathcal{A}$.

This extends the bracelet property of universal sampling sets, though for bracelets we need not assume that $N$ is a prime power.

Proof: For any $0 \le k \le M$ and $a \in [0 : p^k - 1]$,

$$\chi_k(a; \mathcal{A}') = \chi_k(a; [0 : p^M - 1]) - \chi_k(a; \mathcal{A}) = p^{M-k} - \chi_k(a; \mathcal{A}).$$

Next, since

$$|\chi_k(a; \mathcal{A}) - \chi_k(b; \mathcal{A})| \le 1,$$

for all $a, b \in [0 : p^k - 1]$, it follows that

$$|\chi_k(a; \mathcal{A}') - \chi_k(b; \mathcal{A}')| \le 1.$$
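The argument can be mirrored in a few lines of Python. This is a sketch under the assumption, taken from the proofs above, that universality for $N = p^M$ is tested by the residue counts $\chi_k$; the helper names `chi`, `is_universal` and `complement` are ours.

```python
def chi(k, a, index_set, p):
    """Number of elements of index_set congruent to a modulo p**k."""
    return sum(1 for x in index_set if x % p**k == a)


def is_universal(index_set, p, M):
    """Residue-count criterion: for each k the counts chi_k differ by at most 1."""
    for k in range(M + 1):
        counts = [chi(k, a, index_set, p) for a in range(p**k)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def complement(index_set, p, M):
    """The set A' = [0 : p^M - 1] minus A."""
    return set(range(p**M)) - set(index_set)


p, M = 2, 3
A = {0, 1, 2, 3, 4, 5}            # the universal set shown in Fig. 5
A_prime = complement(A, p, M)
print(sorted(A_prime), is_universal(A, p, M), is_universal(A_prime, p, M))
```

Since $\chi_k(a; \mathcal{A}') = p^{M-k} - \chi_k(a; \mathcal{A})$, the spread of the counts is the same for a set and its complement, which is what the check reports.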



Figure 7 displays $\log C(d, 5^M)$ as a function of $d$ as $M$ takes increasing values. The plots show the symmetry, $C(d, p^M) = C(p^M - d, p^M)$, but they show much more. We can observe the following:

(i) There are a series of bumps on several (visible) scales. One cannot fail to notice that at each scale the number of bumps in the graph is 5, which is the prime $p$ here. Experiments with other primes have similar plots and in each case indicate that the number of bumps is equal to the prime.

(ii) With increasing $M$ the plots of the count are converging in shape: they all start to look similar.

The second point can indeed be quantified. One can show that for each $\alpha \in [0, 1]$,

$$\lim_{M \to \infty} \frac{\log C(\lfloor \alpha p^M \rfloor, p^M)}{p^M}$$

exists. See [7]. This compares nicely with the fact that a similar function with $C(d, N)$ replaced by $\binom{N}{d}$ also converges, and to the entropy function:

$$\lim_{M \to \infty} \frac{1}{p^M} \log \binom{p^M}{\lfloor \alpha p^M \rfloor} = \alpha \log \frac{1}{\alpha} + (1 - \alpha) \log \frac{1}{1 - \alpha} = H(\alpha).$$

This is the limiting case of counting all index sets. Plots of

$$H_p(\alpha) = \lim_{M \to \infty} \frac{\log C(\lfloor \alpha p^M \rfloor, p^M)}{p^M}, \qquad 0 \le \alpha \le 1, \tag{38}$$

are shown in Figures 8 and 9 for several values of $p$, along with a plot of $H(\alpha)$.

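The plots of $H_p(\alpha)$ seem to satisfy observation (i), that the curves have $p$ bumps at each scale. Here is an explanation. In the notation of Theorem 7 suppose $\alpha_1 = 0$ (i.e., $d < p^{M-1}$). Then $d_1 = d$ and we have, as in (37),

$$C(d, p^M) = \binom{p}{1}^{d} \binom{p}{0}^{p^{M-1} - d} C(d, p^{M-1}) = p^d\, C(d, p^{M-1}).$$

Let $M \to \infty$, so $d / p^M \to \alpha$ (with $\alpha < 1/p$). Then with reference to (38),

$$H_p(\alpha) = \lim_{M \to \infty} \left( \frac{d}{p^M} \log p + \frac{1}{p} \cdot \frac{\log C(d, p^{M-1})}{p^{M-1}} \right) = \alpha \log p + \frac{1}{p} H_p(p\alpha),$$

leading to the self-similar plots we observe.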



E. Maximal and Minimal Universal Sampling Sets 

Along with maximal universal sets is the allied notion of 
minimal universal sets. 

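Definition 7: Let $\mathcal{I} \subseteq [0 : N - 1]$. A minimal universal sampling set for $\mathcal{I}$ is a universal sampling set of smallest cardinality that contains $\mathcal{I}$.

Again we need a notation and we let $\Phi(\mathcal{I})$ denote a generic minimal universal sampling set containing $\mathcal{I}$. Thus $|\Phi(\mathcal{I})| \le |\mathcal{J}|$ for any universal sampling set $\mathcal{J} \supseteq \mathcal{I}$.

Let us show one way that maximal and minimal universal sampling sets are related. The proof relies on Lemma 8 from the previous subsection.

Theorem 8: Let $\mathcal{I} \subseteq [0 : p^M - 1]$, $\mathcal{I}' = [0 : p^M - 1] \setminus \mathcal{I}$. Then

$$|\Phi(\mathcal{I})| = p^M - |\Omega(\mathcal{I}')|.$$

Proof: Let $\mathcal{A}' = [0 : p^M - 1] \setminus \Phi(\mathcal{I})$. Then $\mathcal{A}'$ is universal by Lemma 8. Since $\Phi(\mathcal{I}) \supseteq \mathcal{I}$ we have $\mathcal{A}' \subseteq [0 : N - 1] \setminus \mathcal{I} = \mathcal{I}'$ and hence

$$p^M - |\Phi(\mathcal{I})| = |\mathcal{A}'| \le |\Omega(\mathcal{I}')|.$$

Similarly, let $\mathcal{B}' = [0 : p^M - 1] \setminus \Omega(\mathcal{I}')$. Then $\mathcal{B}'$ is universal, it contains $[0 : p^M - 1] \setminus \mathcal{I}' = \mathcal{I}$ and so

$$p^M - |\Omega(\mathcal{I}')| = |\mathcal{B}'| \ge |\Phi(\mathcal{I})|.$$

Taken together the two inequalities prove the theorem. ■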
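For small $N$ the identity of Theorem 8 can be confirmed by exhaustive search. The Python sketch below does this for $N = 8$; it is only an illustration, not the constructive algorithm of the paper. The function names are ours, and `is_universal` assumes the residue-count criterion used in the proofs above.

```python
from itertools import combinations


def is_universal(index_set, p, M):
    """Residue-count criterion (assumed characterization for N = p**M)."""
    for s in range(1, M + 1):
        counts = [0] * p**s
        for x in index_set:
            counts[x % p**s] += 1
        if max(counts) - min(counts) > 1:
            return False
    return True


def universal_sets(p, M):
    """All universal subsets of [0 : p^M - 1], as frozensets (brute force)."""
    N = p**M
    return [frozenset(S) for d in range(N + 1)
            for S in combinations(range(N), d) if is_universal(S, p, M)]


def omega_size(I, universal):
    """|Omega(I)|: size of a largest universal set contained in I."""
    return max(len(U) for U in universal if U <= I)


def phi_size(I, universal):
    """|Phi(I)|: size of a smallest universal set containing I."""
    return min(len(U) for U in universal if U >= I)


p, M = 2, 3
N = p**M
U = universal_sets(p, M)
I = frozenset({0, 2, 4})
I_prime = frozenset(range(N)) - I
print(phi_size(I, U), N - omega_size(I_prime, U))
```

The two printed numbers coincide for this $\mathcal{I}$, and looping over all 256 subsets of $[0 : 7]$ gives the same agreement.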
VI. An Uncertainty Principle, Random Signals, and Sumsets

Generally speaking, an "uncertainty principle" is an inequality relating the supports of a nonzero function and its Fourier transform, in the present setting $f \colon \mathbb{Z}_N \to \mathbb{C}$ and $\mathcal{F}f \colon \mathbb{Z}_N \to \mathbb{C}$. The notions of maximal and minimal universal sampling sets lead immediately to an additive uncertainty principle. Without the language of universality, Tao [6] made this connection in the case when $N$ is a prime using Chebotarev's theorem (see Corollary 5), though, as he states, it was probably already known as a folk theorem.

Let

$$\mathcal{Z}(f) = \{i : f(i) = 0\}$$

be the zero set of $f$. The support is the complement of the zero set, and we denote it by $\mathrm{supp}(f)$. Our result is

Theorem 9: If $f$ is not the zero function then

$$|\mathrm{supp}(\mathcal{F}f)| \ge 1 + |\Omega(\mathcal{Z}(f))|, \qquad |\mathrm{supp}(f)| \ge 1 + |\Omega(\mathcal{Z}(\mathcal{F}f))|; \tag{39}$$

and

$$|\mathcal{Z}(\mathcal{F}f)| + 1 \le |\Phi(\mathrm{supp}(f))|, \qquad |\mathcal{Z}(f)| + 1 \le |\Phi(\mathrm{supp}(\mathcal{F}f))|. \tag{40}$$

We are not assuming that $N$ is a prime power here. However, we immediately deduce

Corollary 5 (Tao): If $N$ is prime and $f$ is not the zero function then

$$|\mathrm{supp}(\mathcal{F}f)| + |\mathrm{supp}(f)| \ge N + 1.$$
Fig. 7: Plots of $\log C(d, p^M)$ vs. $d$ for powers of $p = 5$. Note the 5 bumps on different scales.

Fig. 8: Plots of the limit of the counting functions for $p = 2, 5$ compared to the entropy function. Note the self-similarity as it depends on the prime.

Fig. 9: Similar to Figure 8 with $p = 3, 7$.



Proof: If $N$ is prime, then by Chebotarev's theorem every index set is universal. In particular the set $\mathcal{Z}(f)$ is universal. Hence $\Omega(\mathcal{Z}(f)) = \mathcal{Z}(f)$. From Theorem 9,

$$|\mathrm{supp}(\mathcal{F}f)| \ge 1 + |\Omega(\mathcal{Z}(f))| = 1 + |\mathcal{Z}(f)| = 1 + N - |\mathrm{supp}(f)|.$$

■
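Corollary 5 is easy to observe numerically. The Python sketch below uses a direct $O(N^2)$ DFT and a numerical tolerance to decide which coefficients count as zero; both the helper names and the tolerance `1e-9` are our assumptions for illustration. It checks all nonzero 0/1-valued signals of length $N = 7$, and then shows a length-8 signal for which the sum of the supports falls below $N + 1$, so the primality of $N$ cannot simply be dropped.

```python
import cmath
from itertools import product


def dft(f):
    """Unnormalized discrete Fourier transform, computed directly."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def support_size(v, tol=1e-9):
    """Number of entries whose magnitude exceeds the tolerance."""
    return sum(1 for x in v if abs(x) > tol)


def support_sum(f):
    """|supp(f)| + |supp(Ff)|."""
    return support_size(f) + support_size(dft(f))


N = 7
worst = min(support_sum(f) for f in product((0, 1), repeat=N) if any(f))
print("N = 7: smallest support sum", worst, "bound", N + 1)

g = [1, 0, 0, 0, 1, 0, 0, 0]      # indicator of {0, 4} in Z_8
print("N = 8: support sum", support_sum(g), "vs N + 1 =", 9)
```

For the length-8 example the zero set of $g$ is far from universal, which is the situation Theorem 9 quantifies through $|\Omega(\mathcal{Z}(g))|$.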

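We also have

Corollary 6: Suppose $f$ vanishes on a set of consecutive integers $\mathcal{I}$. Then $|\mathrm{supp}(\mathcal{F}f)| \ge |\mathcal{I}| + 1$. If $\mathcal{J}$ is a set of integers such that $\mathcal{F}f(\mathcal{J}) = 0$, then $|\mathcal{I}| + |\mathcal{J}| \le N - 1$.

Proof: We observed previously that any set of consecutive integers, $\mathcal{I}$ in this case, is universal. Since $\mathcal{I} \subseteq \mathcal{Z}(f)$, we have $|\Omega(\mathcal{Z}(f))| \ge |\mathcal{I}|$. From Theorem 9, this implies $|\mathrm{supp}(\mathcal{F}f)| \ge |\mathcal{I}| + 1$. Further, if $\mathcal{F}f(\mathcal{J}) = 0$ then $N - |\mathcal{J}| \ge |\mathrm{supp}(\mathcal{F}f)|$ and so $N - |\mathcal{J}| \ge |\mathcal{I}| + 1$. ■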

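The proof of Theorem 9 itself is very brief.

Proof of Theorem 9: Suppose $|\mathrm{supp}(\mathcal{F}f)| \le |\Omega(\mathcal{Z}(f))|$. From $\Omega(\mathcal{Z}(f)) \subseteq \mathcal{Z}(f)$ it follows that $f$ vanishes on $\Omega(\mathcal{Z}(f))$. Since $\Omega(\mathcal{Z}(f))$ is a universal sampling set this implies that $\mathcal{F}f \equiv 0$, contradicting the assumption that $f$ is not the zero function. This proves the first statement in (39). A similar argument establishes the second statement.

For the proof of (40), write $\mathcal{Z} = \mathcal{Z}(\mathcal{F}f)$ and $\mathcal{A} = \Phi(\mathrm{supp}(f))$. Then

$$\mathcal{F}f(\mathcal{Z}) = 0 \quad \text{and so} \quad E_{\mathcal{Z}}^T \mathcal{F} f = 0.$$

However $f$ is supported within $\mathcal{A}$, and so we may write $f = E_{\mathcal{A}}\, g$, where $g = f(\mathcal{A}) \ne 0$. This means we must have

$$E_{\mathcal{Z}}^T \mathcal{F} E_{\mathcal{A}}\, g = 0, \quad \text{for some } g \ne 0, \tag{41}$$

i.e., the columns of $E_{\mathcal{Z}}^T \mathcal{F} E_{\mathcal{A}}$ are dependent. This is expected if $|\mathcal{Z}| < |\mathcal{A}|$. However, if $|\mathcal{Z}| \ge |\mathcal{A}|$, this contradicts the universality of $\mathcal{A}$. Hence we must have $|\mathcal{Z}| \le |\mathcal{A}| - 1$, which is the first inequality in (40). A similar argument establishes the second statement. ■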

It is interesting that when $N$ is a prime power the two statements (39) and (40) are equivalent. To see this we first derive (40) from (39) when $N = p^M$. This appeals to Theorem 8 on the relation between maximal and minimal sets, with $\mathrm{supp}(f) = [0 : N - 1] \setminus \mathcal{Z}(f)$. Thus, from (39), $|\mathrm{supp}(\mathcal{F}f)| \ge 1 + |\Omega(\mathcal{Z}(f))|$, and substituting from Theorem 8,

$$|\mathrm{supp}(\mathcal{F}f)| \ge 1 + N - |\Phi(\mathrm{supp}(f))|.$$

But $|\mathrm{supp}(\mathcal{F}f)| = N - |\mathcal{Z}(\mathcal{F}f)|$, so

$$N - |\mathcal{Z}(\mathcal{F}f)| \ge 1 + N - |\Phi(\mathrm{supp}(f))|,$$

which is the same as the first statement in (40). Again, the second statement in (40) follows in a similar manner. We could have started instead with (40) and from this derived (39).

In cases where $\mathcal{Z}(f)$ itself is a universal sampling set, the uncertainty principle in Theorem 9 can be as strong as the uncertainty principle for the prime $N$ case.

Remark 6: Readers familiar with the seminal paper of Donoho and Stark [14] will wonder if the additive uncertainty principle in Theorem 9 can be applied to the problem of reconstruction of a signal corrupted by sparse noise. (See also [15] for more recent work.) The answer is yes, and we refer to [16].

A. Random Index Sets and Random Signals 

We will give several applications of these ideas. First we combine Theorem 9 with a probabilistic estimate on the size of a maximal universal sampling set for randomly chosen index sets. We must revert to the assumption that $N$ is a prime power.

Theorem 10: Let $N = p^M$. Let $\mathcal{R}_s$ be an index set of $s$ numbers chosen at random from $[0 : N - 1]$. Let $\lambda = (N - s)/N$. If $d, \delta > 0$ satisfy

$$N \log(1/\lambda) \ge (1 + \delta)\, d \log d, \tag{42}$$

then $|\Omega(\mathcal{R}_s)| \ge d$ with probability at least $1 - d^{-\delta}$.

This means that if we can choose a large $d$ satisfying (42), which is possible, for example, if $N$ is large and $\lambda$ is small, then $|\Omega(\mathcal{R}_s)| \ge d$ with high probability. Thus while it is unlikely that a randomly chosen index set will be universal, it is quite likely that such an index set will contain a large universal set as a subset.

We will apply Theorem 10 to the case when $\mathcal{R}_s$ is the zero set of $f \colon \mathbb{Z}_N \to \mathbb{C}$. Then $\lambda = |\mathrm{supp}(f)|/N$, i.e., $\lambda$ is the fraction of nonzero entries in $f$.

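Proof: The proof uses the bound in part (ii) of Theorem 5. Let $k$ be the largest integer such that no congruence classes in $\mathcal{R}_s / p^k$ are empty. Note that $k$ is random since $\mathcal{R}_s$ is random. Then $|\Omega(\mathcal{R}_s)| \le d - 1$ implies

$$p^k \le |\Omega(\mathcal{R}_s)| \le d - 1,$$

by Theorem 5. Therefore

$$\begin{aligned} \mathrm{Prob}\bigl(|\Omega(\mathcal{R}_s)| \le d - 1\bigr) &\le \mathrm{Prob}(p^k \le d - 1) \\ &= \mathrm{Prob}\bigl(k \le \lfloor \log_p(d - 1) \rfloor\bigr) \\ &= \mathrm{Prob}\bigl(\text{at least one congruence class in } \mathcal{R}_s / p^{\lfloor \log_p(d-1) \rfloor + 1} \text{ is empty}\bigr). \end{aligned} \tag{43}$$

We will compute the last probability.

Let $b = \lfloor \log_p(d - 1) \rfloor + 1$, and let $\mathcal{N}_{ba}$ be the set of elements in $[0 : N - 1]$ that leave a remainder of $a \in [0 : p^b - 1]$ when divided by $p^b$. Since $N = p^M$ all of the $\mathcal{N}_{ba}$ have size

$$t = N / p^b = p^M / p^{\lfloor \log_p(d-1) \rfloor + 1}.$$

Fix a particular residue $a$. The probability that $\mathcal{N}_{ba} \cap \mathcal{R}_s$ is empty (in words, the probability that a particular congruence class goes missing in $\mathcal{R}_s$) is $\binom{N-t}{s} / \binom{N}{s}$. This is because the number of ways of picking $\mathcal{R}_s$ is $\binom{N}{s}$, while the number of ways of picking $\mathcal{R}_s$ so that $\mathcal{N}_{ba} \cap \mathcal{R}_s = \emptyset$ is the number of ways of picking $s$ elements from

$$|[0 : N - 1] \setminus \mathcal{N}_{ba}| = N - t$$

elements. Then

$$\begin{aligned} \mathrm{Prob}(\mathcal{N}_{ba} \cap \mathcal{R}_s = \emptyset) &= \binom{N-t}{s} \Big/ \binom{N}{s} \\ &= \frac{(N - (t-1) - s)(N - (t-2) - s) \cdots (N - s)}{(N - t + 1)(N - t + 2) \cdots N} \\ &= \left(1 - \frac{s}{N - t + 1}\right)\left(1 - \frac{s}{N - t + 2}\right) \cdots \left(1 - \frac{s}{N}\right) \\ &\le \left(1 - \frac{s}{N}\right)^t = \lambda^t. \end{aligned} \tag{44}$$

From this,

$$\begin{aligned} \mathrm{Prob}\bigl(\text{at least one congruence class in } \mathcal{R}_s / p^{\lfloor \log_p(d-1) \rfloor + 1} \text{ is empty}\bigr) &= \mathrm{Prob}\Bigl(\bigcup_a \bigl(\mathcal{N}_{ba} \cap \mathcal{R}_s = \emptyset\bigr)\Bigr) \\ &\le \sum_a \mathrm{Prob}(\mathcal{N}_{ba} \cap \mathcal{R}_s = \emptyset) \\ &\le \frac{N}{t} \left(1 - \frac{s}{N}\right)^t = \frac{N}{t}\lambda^t. \end{aligned} \tag{45}$$

Hence we have from (43) and (45),

$$\mathrm{Prob}\bigl(|\Omega(\mathcal{R}_s)| \le d - 1\bigr) \le N \lambda^t / t. \tag{46}$$

Now, $t = N / p^{\lfloor \log_p(d-1) \rfloor + 1} \ge N/d$, since $\lfloor x \rfloor \le x$. Using this in (46),

$$\begin{aligned} \mathrm{Prob}\bigl(|\Omega(\mathcal{R}_s)| \le d - 1\bigr) &\le N \lambda^t / t \le d\, \lambda^{N/d} \\ &= \exp\left(\log d - \frac{N \log(1/\lambda)}{d}\right) \\ &= \exp\left(\log d \left(1 - \frac{N \log(1/\lambda)}{d \log d}\right)\right) \\ &\le \exp(-\delta \log d) \quad \text{(from the hypothesis of the theorem)} \\ &= d^{-\delta}. \end{aligned} \tag{47}$$

We conclude that $\mathrm{Prob}\bigl(|\Omega(\mathcal{R}_s)| \ge d\bigr) \ge 1 - d^{-\delta}$. ■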
We can now state a probabilistic uncertainty principle. Afterward we will comment on how this compares to the result of Candès, Romberg and Tao [3].

Theorem 11: Let $N = p^M$. Let $\mathcal{G}_{N,r}$ be the set of all signals $g \colon \mathbb{Z}_N \to \mathbb{C}$ with support of size $r$. Let $g \in \mathcal{G}_{N,r}$ be a signal whose support is drawn at random from the set of all index sets of size $r$. Let the values of $g$ on the support set be drawn according to some arbitrary distribution. For $\delta > 0$ let

$$\alpha_{N,\delta} = \frac{N}{(1 + \delta) \log N} \bigl(1 + \log(1 + \delta) + \log\log N\bigr).$$

Then

$$|\mathrm{supp}(g)| + |\mathrm{supp}(\mathcal{F}g)| \ge 1 + \alpha_{N,\delta}, \tag{48}$$

with probability at least $1 - (\alpha_{N,\delta} - r)^{-\delta}$.

If $r$ is small compared to $\alpha_{N,\delta}$, Theorem 11 states that almost all signals $g$ in $\mathcal{G}_{N,r}$ satisfy the uncertainty principle above; roughly speaking

$$|\mathrm{supp}(g)| + |\mathrm{supp}(\mathcal{F}g)| \ge N(1 + \log\log N) / \log N$$

for most $g$.

Proof: Picking the support of $g$ at random among sets of size $r$ is equivalent to picking the zero set of $g$ at random among all index sets of size $N - r$. The proof now makes use of Theorem 10 to get a lower bound on $|\Omega(\mathcal{Z}(g))|$. For this we need to choose $d, \delta$ so that

$$N \log(1/\lambda) = N \log(N/r) \ge (1 + \delta)\, d \log d. \tag{49}$$

Fix any $\delta > 0$ and let $d = N \log(N/r) / \bigl((1 + \delta) \log N\bigr)$. We check that $d, \delta$ satisfy (49):

$$(1 + \delta)\, d \log d = \frac{N \log(N/r)}{\log N} \log\left(\frac{N \log(N/r)}{(1 + \delta) \log N}\right) \le \frac{N \log(N/r)}{\log N} \log N = N \log(N/r).$$

Then from Theorem 10,

$$|\Omega(\mathcal{Z}(g))| \ge \frac{N \log(N/r)}{(1 + \delta) \log N}$$

with probability $1 - d^{-\delta}$. From the uncertainty principle, Theorem 9, we now have

$$|\mathrm{supp}(\mathcal{F}g)| \ge 1 + |\Omega(\mathcal{Z}(g))| \ge 1 + \frac{N \log(N/r)}{(1 + \delta) \log N}$$

with probability $1 - d^{-\delta}$.

The final step in the proof uses a lower bound on $d = N \log(N/r) / \bigl((1 + \delta) \log N\bigr)$. We have set apart this technical result as Lemma 9 below. This gives

$$|\mathrm{supp}(\mathcal{F}g)| \ge 1 + \alpha_{N,\delta} - r$$

with probability $1 - d^{-\delta}$. Since $1 - d^{-\delta} \ge 1 - (\alpha_{N,\delta} - r)^{-\delta}$, we can say

$$|\mathrm{supp}(\mathcal{F}g)| \ge 1 + \alpha_{N,\delta} - r$$

with probability at least $1 - (\alpha_{N,\delta} - r)^{-\delta}$. The result follows since $r = |\mathrm{supp}(g)|$. ■
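Lemma 9: Let

$$d = \frac{N \log(N/r)}{(1 + \delta) \log N}$$

and

$$\alpha_{N,\delta} = \frac{N}{(1 + \delta) \log N} \bigl(1 + \log(1 + \delta) + \log\log N\bigr)$$

as in Theorem 11. Then $d \ge \alpha_{N,\delta} - r$.

Proof: The convex function $r \mapsto \log(N/r)$ is bounded below by its tangent at any point $r_0 > 0$. Thus

$$\log(N/r) \ge \log(N/r_0) - \frac{r - r_0}{r_0}.$$

For

$$r_0 = \frac{N}{(1 + \delta) \log N},$$

this reads

$$\log(N/r) \ge \log\bigl((1 + \delta) \log N\bigr) - \frac{(1 + \delta) \log N}{N} \left(r - \frac{N}{(1 + \delta) \log N}\right).$$

Multiplying by $N / \bigl((1 + \delta) \log N\bigr)$, we have

$$\begin{aligned} d = \frac{N \log(N/r)}{(1 + \delta) \log N} &\ge \frac{N \log\bigl((1 + \delta) \log N\bigr)}{(1 + \delta) \log N} - r + \frac{N}{(1 + \delta) \log N} \\ &= \frac{N}{(1 + \delta) \log N} \bigl(\log(1 + \delta) + 1 + \log\log N\bigr) - r \\ &= \alpha_{N,\delta} - r. \end{aligned}$$

■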

Remark 7: The robust uncertainty principle of Candès, Romberg and Tao in [3] is as follows: for $M > 0$ there exists a constant $C_M$ such that

$$|\mathrm{supp}(g)| + |\mathrm{supp}(\mathcal{F}g)| \ge C_M N (\log N)^{-1/2}$$

with probability $1 - O(N^{-M})$. This inequality is stronger than that of Theorem 11 by about $(\log N)^{-1/2}$. Also, Theorem 11 holds for $N = p^M$, whereas the inequality above holds for all $N$.

In our proof of Theorem 10 we have only used the bound $|\Omega(\mathcal{Z}(g))| \ge p^k$ from Theorem 5. By using the exact formula for $|\Omega(\mathcal{Z}(g))|$ in Theorem 6 (or by a better lower bound) it might be possible to tighten the uncertainty principle of Theorem 11 and remove the factor $(\log N)^{-1/2}$.



B. Sumsets and the Cauchy-Davenport Theorem 

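Our final application is a generalization of the Cauchy-Davenport theorem [17], from additive number theory, on the size of sumsets. Again the inspiration comes from Tao's approach [6] to the original Cauchy-Davenport theorem via Chebotarev's theorem.

Theorem 12: Let $\mathcal{X}, \mathcal{Y} \subseteq [0 : N - 1]$. If either $\mathcal{X}$ or $\mathcal{Y}$ is a universal sampling set, then

$$|\mathcal{X} + \mathcal{Y}| \ge |\mathcal{X}| + |\mathcal{Y}| - 1, \tag{50}$$

when $|\mathcal{X}| + |\mathcal{Y}| - 1 \le N$.

Here $\mathcal{X} + \mathcal{Y}$ is the sumset defined as

$$\mathcal{X} + \mathcal{Y} = \{x + y : x \in \mathcal{X},\ y \in \mathcal{Y}\},$$

where the addition is modulo $N$.

We are not assuming that $N$ is a prime power, while the classical theorem has $N = p$ and there are no assumptions on $\mathcal{X}$ or $\mathcal{Y}$. That form of the result follows from Theorem 12 since all index sets in $[0 : N - 1]$ are universal when $N$ is prime.

As a corollary we get a statement on the size of $|\mathcal{X} + \mathcal{Y}|$ without making an assumption on $\mathcal{X}$ or $\mathcal{Y}$.

Corollary 7: Let $\mathcal{X}, \mathcal{Y} \subseteq [0 : N - 1]$ be index sets. Then,

$$|\mathcal{X} + \mathcal{Y}| \ge \max\{|\Omega(\mathcal{X})| + |\mathcal{Y}| - 1,\ |\mathcal{X}| + |\Omega(\mathcal{Y})| - 1\}. \tag{51}$$

Proof: Since $\Omega(\mathcal{X}) \subseteq \mathcal{X}$, it follows that $\Omega(\mathcal{X}) + \mathcal{Y} \subseteq \mathcal{X} + \mathcal{Y}$. Now,

$$|\mathcal{X} + \mathcal{Y}| \ge |\Omega(\mathcal{X}) + \mathcal{Y}| \ge |\Omega(\mathcal{X})| + |\mathcal{Y}| - 1$$

from Theorem 12. The inequality $|\mathcal{X} + \mathcal{Y}| \ge |\mathcal{X}| + |\Omega(\mathcal{Y})| - 1$ follows similarly. ■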

Proof of Theorem 12: First note that (50) follows trivially when either $\mathcal{X}$ or $\mathcal{Y}$ is a singleton. (More precisely, if, say, $\mathcal{X}$ is a singleton, then $\mathcal{X} + \mathcal{Y}$ is just a translate of $\mathcal{Y}$, and so (50) holds with equality.) For the rest of the proof, we assume that $|\mathcal{X}|, |\mathcal{Y}| \ge 2$. Let $|\mathcal{X}| = r$, $|\mathcal{Y}| = s$.

Assume without loss of generality that $\mathcal{X}$ is universal. Let

$$f_1 \in \mathbb{B}^{\mathcal{X}} \text{ be such that } f_1([1 : r]) = (\underbrace{0, 0, \ldots, 0}_{r - 1 \text{ times}}, 1).$$

Such an $f_1$ exists because the set $[1 : r]$, as an index set of $r$ consecutive integers, is a universal sampling set, so is in particular a sampling set for $\mathbb{B}^{\mathcal{X}}$. Similarly let

$$f_2 \in \mathbb{B}^{\mathcal{Y}} \text{ be such that } f_2([r : r + s - 1]) = (\underbrace{0, 0, \ldots, 0}_{s - 1 \text{ times}}, 1),$$

again possible because $[r : r + s - 1]$ is a set of $s$ consecutive integers, and hence a sampling set for $\mathbb{B}^{\mathcal{Y}}$. Note that $f_1 f_2 \in \mathbb{B}^{\mathcal{X} + \mathcal{Y}}$ and so $|\mathcal{X} + \mathcal{Y}| \ge |\mathrm{supp}(\mathcal{F}(f_1 f_2))|$. Note also that the zero set $\mathcal{Z}(f_1 f_2)$ of $f_1 f_2$ contains $[1 : r + s - 2]$, and hence, since the latter is a universal sampling set, $|\Omega(\mathcal{Z}(f_1 f_2))| \ge r + s - 2 = |\mathcal{X}| + |\mathcal{Y}| - 2$.

Now we apply the uncertainty principle of Theorem 9 to $f_1 f_2$. We have, so long as $f_1 f_2 \not\equiv 0$,

$$\begin{aligned} |\mathcal{X} + \mathcal{Y}| &\ge |\mathrm{supp}(\mathcal{F}(f_1 f_2))| \\ &\ge 1 + |\Omega(\mathcal{Z}(f_1 f_2))| \\ &\ge 1 + |\mathcal{X}| + |\mathcal{Y}| - 2 = |\mathcal{X}| + |\mathcal{Y}| - 1. \end{aligned} \tag{52}$$

So we have proved that $|\mathcal{X} + \mathcal{Y}| \ge |\mathcal{X}| + |\mathcal{Y}| - 1$ if we know that $f_1 f_2 \not\equiv 0$.

For this, again from Theorem 9 we have

$$|\mathcal{Z}(f_1)| \le |\Phi(\mathrm{supp}(\mathcal{F}f_1))| - 1 \le |\Phi(\mathcal{X})| - 1,$$

since $f_1 \in \mathbb{B}^{\mathcal{X}}$. But $\mathcal{X}$ is universal, so $\Phi(\mathcal{X}) = \mathcal{X}$ and

$$|\mathcal{Z}(f_1)| \le |\mathcal{X}| - 1. \tag{53}$$

By definition of $f_1$, the set $[1 : r - 1] = [1 : |\mathcal{X}| - 1]$ is already in $\mathcal{Z}(f_1)$. Together with (53), this implies that $f_1$ cannot have any more zeros. In particular, $f_1(r + s - 1) \ne 0$. Since $f_2(r + s - 1) = 1$, $f_1 f_2$ cannot be identically zero and (52) applies. ■

An important generalization of the Cauchy-Davenport theorem to any finite abelian group, not necessarily of prime order, is due to Kneser [18].

Theorem 13 (Kneser): Let $G$ be a finite abelian group. Let $A, B \subseteq G$ be nonempty subsets of $G$. Let $H$ be the set of periods, defined by $H = \{h \in G : h + (A + B) = A + B\}$. (Thus $A + B$ is periodic if $H \ne \{0\}$.) Then

$$|A + B| \ge |A| + |B| - |H|.$$

Hence unless $A + B$ is periodic, $|A + B| \ge |A| + |B| - 1$.

Though the form is similar, this result neither implies nor is implied by Theorem 12. We give two examples. Let $N = 8$, $\mathcal{X} = \{0, 1\}$, $\mathcal{Y} = \{0, 4\}$. Then $\mathcal{X}$ is universal and $\mathcal{X} + \mathcal{Y} = \{0, 1, 4, 5\}$ is periodic with period 4. So Theorem 12 applies, but Kneser's theorem does not. Next let $N = 16$, $\mathcal{X} = \{0, 2\}$, $\mathcal{Y} = \{0, 2, 4\}$. Then $\mathcal{X} + \mathcal{Y} = \{0, 2, 4, 6\}$, which is not periodic, and neither $\mathcal{X}$ nor $\mathcal{Y}$ is universal. So Kneser's theorem applies, but Theorem 12 does not. We hope to understand this more thoroughly.
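The two examples can be reproduced with a short Python sketch. The helper names are ours, and `is_universal` assumes the residue-count criterion from Section V for $N = p^M$; it is not a general test for arbitrary $N$.

```python
def sumset(X, Y, N):
    """X + Y with addition modulo N."""
    return {(x + y) % N for x in X for y in Y}


def periods(S, N):
    """The set H of all h in Z_N with h + S = S."""
    S = set(S)
    return {h for h in range(N) if {(h + s) % N for s in S} == S}


def is_universal(index_set, p, M):
    """Residue-count criterion (assumed characterization for N = p**M)."""
    for s in range(1, M + 1):
        counts = [0] * p**s
        for x in index_set:
            counts[x % p**s] += 1
        if max(counts) - min(counts) > 1:
            return False
    return True


for M, X, Y in [(3, {0, 1}, {0, 4}), (4, {0, 2}, {0, 2, 4})]:
    N = 2**M
    S = sumset(X, Y, N)
    H = periods(S, N)
    print("N =", N, "X + Y =", sorted(S), "periods =", sorted(H),
          "X universal:", is_universal(X, 2, M),
          "Y universal:", is_universal(Y, 2, M),
          "|X|+|Y|-1 =", len(X) + len(Y) - 1,
          "|X|+|Y|-|H| =", len(X) + len(Y) - len(H))
```

In the first case the hypothesis of Theorem 12 holds while the sumset has a nontrivial period; in the second the sumset is aperiodic while neither set passes the universality test.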

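Appendix A
Condition Number Associated with the Universal Sampling Set $\mathcal{I}^*$

An index set of consecutive integers is the simplest universal sampling set, but there is a catch in using it. Let $\mathcal{I}$ be a universal sampling set of size $d$, $f \in \mathbb{C}^N$, and $f_{\mathcal{I}}$ the $d$-vector obtained from $f$ by sampling at locations in $\mathcal{I}$. If $f$ is in some bandlimited space $\mathbb{B}^{\mathcal{J}}$, $|\mathcal{J}| = d$, then the interpolation formula reads

$$f = \mathcal{F} E_{\mathcal{J}} \bigl(E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}\bigr)^{-1} f_{\mathcal{I}}.$$

The practical difficulty is the computation of the inverse of $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$. Suppose we use $\mathcal{I} = \mathcal{I}^* = [0 : d - 1]$ as a universal sampling set. We give a lower bound on the condition number of $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ that can be quite large for some $\mathcal{J}$, even though the matrix $E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$ is invertible for all $\mathcal{J}$.

For $\mathcal{I} = [0 : d - 1]$ the matrix is a Vandermonde matrix in the nodes $e^{-2\pi i j / N}$, $j \in \mathcal{J}$, so

$$\bigl|\det\bigl(E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}\bigr)\bigr| = \prod_{\substack{j_1, j_2 \in \mathcal{J} \\ j_1 < j_2}} \bigl|e^{-2\pi i j_1 / N} - e^{-2\pi i j_2 / N}\bigr| = \prod_{\substack{j_1, j_2 \in \mathcal{J} \\ j_1 < j_2}} \left|2 \sin \frac{\pi (j_1 - j_2)}{N}\right|.$$

If $\{\sigma_j\}$ are the singular values of $A = E_{\mathcal{I}}^T \mathcal{F} E_{\mathcal{J}}$, then

$$|\det(A)| = \sigma_1 \sigma_2 \sigma_3 \cdots \sigma_d \ge \sigma_{\min}^d. \tag{54}$$

Also if $a_{rk} = \exp(-2\pi i\, r j_k / N)$ are the entries of $A$, then

$$d^2 = \sum_{r,k=0}^{d-1} |a_{rk}|^2 = \mathrm{tr}(A^* A) = \sum_{r=0}^{d-1} \sigma_r^2 \le d\, \sigma_{\max}^2, \tag{55}$$

and so $\sigma_{\max}^2 \ge d$. From (54) and (55), the condition number satisfies

$$\frac{\sigma_{\max}}{\sigma_{\min}} \ge \sqrt{d} \left( \prod_{\substack{j_1, j_2 \in \mathcal{J} \\ j_1 \ne j_2}} \frac{1}{\left|2 \sin \frac{\pi (j_1 - j_2)}{N}\right|} \right)^{1/2d},$$

where the product is over ordered pairs of distinct elements of $\mathcal{J}$.

A possible scenario may be when $d$ is very small and $N$ is very large. In this case, the condition number can be very large if the frequency slots $\mathcal{J}$ are clustered.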
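To get a feel for the size of this lower bound, the following Python sketch evaluates $\sqrt{d}\,|\det A|^{-1/d}$, with $|\det A|$ computed as the Vandermonde product of the factors $|2\sin(\pi(j_1 - j_2)/N)|$ over unordered pairs. This is our reading of the bound obtained from (54) and (55); the function name and the example frequency sets are illustrative assumptions, not taken from the paper.

```python
import math


def condition_lower_bound(J, N):
    """sqrt(d) * |det A|**(-1/d) for I = [0 : d-1] and frequency set J.

    |det A| is the product over unordered pairs j1 < j2 in J of
    |2 sin(pi (j1 - j2) / N)|; logarithms are used to avoid underflow.
    """
    J = sorted(J)
    d = len(J)
    log_det = sum(math.log(abs(2 * math.sin(math.pi * (j1 - j2) / N)))
                  for idx, j1 in enumerate(J) for j2 in J[idx + 1:])
    return math.sqrt(d) * math.exp(-log_det / d)


N = 1024
spread = [0, 256, 512, 768]      # evenly spaced frequencies
clustered = [0, 1, 2, 3]         # adjacent frequencies
print(condition_lower_bound(spread, N))
print(condition_lower_bound(clustered, N))
```

For the evenly spaced set the bound is trivial (the matrix is a scaled unitary matrix), while for four adjacent frequencies with $N = 1024$ the bound is already in the thousands.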

Appendix B 
Counting Bracelets 

Several of our results, Theorem 4 for example, depend only on the bracelet of an index set rather than on the index set itself. Thus it is useful to know how many bracelets there are and how to enumerate them. Counting bracelets (actually, multicolored bracelets) is a standard application in combinatorics of the orbit-stabilizer theorem, and the problem is treated in many places. Our situation is slightly different because we want a count that specifies the number of black beads in a black-and-white bracelet, corresponding to the size of the index set that determines the locations of the black beads. Nevertheless, the orbit-stabilizer theorem can still be applied, and we have the following results.

Theorem 14: Let $\phi$ denote Euler's totient function. When $N$ is odd, the number of black-and-white bracelets of length $N$ with exactly $d$ black beads is

$$\begin{cases} \dfrac{1}{2} \dbinom{(N-1)/2}{d/2} + \dfrac{1}{2N} \displaystyle\sum_{k \mid N,\ k \mid d} \phi(k) \dbinom{N/k}{d/k}, & \text{for even } d, \\[3ex] \dfrac{1}{2} \dbinom{(N-1)/2}{(d-1)/2} + \dfrac{1}{2N} \displaystyle\sum_{k \mid N,\ k \mid d} \phi(k) \dbinom{N/k}{d/k}, & \text{for odd } d. \end{cases}$$

When $N$ is even, the number of black-and-white bracelets of length $N$ with exactly $d$ black beads is

$$\begin{cases} \dfrac{1}{2} \dbinom{N/2}{d/2} + \dfrac{1}{2N} \displaystyle\sum_{k \mid N,\ k \mid d} \phi(k) \dbinom{N/k}{d/k}, & \text{for even } d, \\[3ex] \dfrac{1}{2} \dbinom{N/2 - 1}{(d-1)/2} + \dfrac{1}{2N} \displaystyle\sum_{k \mid N,\ k \mid d} \phi(k) \dbinom{N/k}{d/k}, & \text{for odd } d. \end{cases}$$

We omit the proof; see [12]. An efficient algorithm for enumerating bracelets has been devised only recently by Sawada [19]. An algorithm for determining when two index sets are in the same necklace is due to J. P. Duval [20]. It can also be used for bracelets. See [2] for examples of both of these.
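The counts in Theorem 14 can be compared with a direct enumeration for small $N$. The Python sketch below is a plain brute-force check, not Sawada's or Duval's algorithm: it evaluates the formula and, separately, counts orbits of $d$-subsets of $\mathbb{Z}_N$ under rotations and reflections. All function names are ours.

```python
from itertools import combinations
from math import comb, gcd


def totient(n):
    """Euler's totient function, by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def bracelet_count(N, d):
    """Number of bracelets of length N with d black beads (formula of Theorem 14)."""
    rotation_term = sum(totient(k) * comb(N // k, d // k)
                        for k in range(1, N + 1) if N % k == 0 and d % k == 0)
    if N % 2 == 1:
        reflection_term = comb((N - 1) // 2, d // 2)   # d//2 is (d-1)/2 for odd d
    elif d % 2 == 0:
        reflection_term = comb(N // 2, d // 2)
    else:
        reflection_term = comb(N // 2 - 1, (d - 1) // 2)
    # rotation_term / (2N) + reflection_term / 2, kept in integer arithmetic
    return (rotation_term + N * reflection_term) // (2 * N)


def bracelet_count_brute_force(N, d):
    """Count orbits of d-subsets of Z_N under the dihedral group."""
    canonical = set()
    for S in combinations(range(N), d):
        images = []
        for shift in range(N):
            images.append(tuple(sorted((x + shift) % N for x in S)))   # rotation
            images.append(tuple(sorted((shift - x) % N for x in S)))   # reflection
        canonical.add(min(images))
    return len(canonical)


for d in range(9):
    print(d, bracelet_count(8, d), bracelet_count_brute_force(8, d))
```

The brute-force count is exponential in $N$ and is meant only as a sanity check on the closed formula for small lengths.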

Appendix C 
Additional References 

Though our work has concerned discrete-time signals exclusively, there is also a notion of universal sampling sets for continuous-time signals. We will not give the definition; it is interesting and not clear what the relations between the two may be. Here we cite only a few sources, starting with the paper of Landau [21] that featured the renowned necessary density condition on sampling sets. More recently, many interesting results have been obtained by Olevskii and Ulanovskii [22], [23] on universal sampling and stable reconstruction, by Matei and Meyer [24], who work with lattices and make contact with compressed sensing, and by Bass and Gröchenig [25], who consider random sampling. Of course, anyone writing on so fundamental a topic as sampling and interpolation will encounter an enormous literature, and most probably miss an equal or greater amount. We apologize to the authors of works we have missed.